Hacker News
The Top 1 million URLs in the world [warning: 44MB text file] (twitter.com/mikko)
2 points by r721 on June 21, 2013 | 3 comments


From what I understand, the list is part of the ChromeBot tool, a sub-project of Chromium.

   ChromeBot is a distributed crash detection system.

   Chrome is run on a list of URLs to check for crashes.  A 
   pool of machines is used to distribute the workload, 
   each machine might potentially launch several instances 
   of Chrome. Crashes detected are symbolicated and saved 
   in database.
http://src.chromium.org/svn/trunk/tools/chromebot/README.txt


Hard for me to believe Orkut is so high on the list.


The tweet said only "The Top 1 million URLs in the world", not that the list is ordered in any particular way. I doubt the list is sorted.

For example, the first URL with a German domain (.de), ignoring Google, is

http://suchen.mobile.de/fahrzeuge/showDetails.html

I've never heard of that one; spiegel.de and amazon.de should be much more popular. The reason Google appears among the first entries might be that about 2.3% of all URLs in the list are on Google or Facebook domains.
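
For anyone who wants to repeat these two checks, here is a rough sketch in Python. It assumes the file holds one absolute URL per line (with the http:// scheme), which I have not verified against the actual list. The helper names are made up for illustration, and matching "google" or "facebook" as a single hostname label is a crude stand-in for proper domain ownership (it would also count something like google.example.com).

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased hostname of a URL, or '' if none can be parsed."""
    return (urlsplit(url.strip()).hostname or "").lower()


def on_domain(host, names):
    """True if any dot-separated label of the host is in `names` (crude match)."""
    return any(label in names for label in host.split("."))


def first_with_tld(urls, tld, skip=()):
    """First URL whose host ends in '.<tld>', skipping hosts that match `skip`."""
    for url in urls:
        host = host_of(url)
        if host.endswith("." + tld) and not on_domain(host, skip):
            return url
    return None


def share_on(urls, names):
    """Fraction of URLs (with a parseable host) whose host matches `names`."""
    hosts = [h for h in (host_of(u) for u in urls) if h]
    if not hosts:
        return 0.0
    return sum(on_domain(h, names) for h in hosts) / len(hosts)


# Tiny stand-in sample; the real input would be the lines of the 44MB file.
sample = [
    "http://www.google.com/",
    "http://www.google.de/search",
    "http://suchen.mobile.de/fahrzeuge/showDetails.html",
    "http://www.facebook.com/x",
]
first_de = first_with_tld(sample, "de", skip={"google"})
share = share_on(sample, {"google", "facebook"})
```

On the real file, share_on(lines, {"google", "facebook"}) is the number to compare against the roughly 2.3% mentioned above. The exact figure will depend on how the domains are matched.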



