Yahoo's Index is over 50 Times Bigger than Google's
Yahoo!'s recent announcement that its index contains 20 billion pages stirred a lot of controversy. Yahoo!'s claim would put its index at over twice the size of competitor Google's, which is advertised as comprising just over 8 billion pages. However, a quick bare-bones study that I conducted on Alexa's top 100 anglophone websites suggests the gap is far larger.
The analysis was simple and seems to substantiate other claims for specific verticals such as the blogging vertical. I compared the number of referring links reported for each of the top 100 sites in Google, Yahoo!, and MSN. The premise is trivial: the total number of pages actually indexed is proportional to the number of referring links in each index. (This would not hold true if, say, Google didn't report all of its referring links while Yahoo! did.)
The results substantiated the suspicion that I and others have had: that Google is misrepresenting the size of its index. Indeed, the numbers in this study support that assertion. Granted, a complete and unbiased study should involve a random sample of sites that is large enough to draw statistically significant conclusions. Even so, the numbers in this rudimentary analysis are so convincing that not only does Yahoo!'s index seem larger, it is very likely larger by at least one if not two orders of magnitude.
For the top 100 anglophone websites as ranked by Alexa, the ratios are:
- Yahoo! index to Google index = 51.0 to 1
- MSN index to Google index = 6.5 to 1
- Yahoo! index to MSN index = 7.8 to 1
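For readers who want to repeat the calculation, here is a minimal Python sketch of how such ratios can be derived from per-site referring-link counts. It assumes the ratios are computed from the summed link counts across all sites (a ratio of totals); the function name `index_ratios` and the sample figures are hypothetical placeholders, not the data from this study.

```python
# Sketch only: assumes each ratio is (total links in engine A) / (total links in engine B).
# The site names and counts below are made-up placeholders, not the study's data.

ENGINES = ("google", "yahoo", "msn")

SAMPLE_LINK_COUNTS = {
    "example-a.com": {"google": 1000, "yahoo": 40000, "msn": 6000},
    "example-b.com": {"google": 500, "yahoo": 35000, "msn": 4000},
}


def index_ratios(link_counts):
    """Return the engine-to-engine ratios of total referring links."""
    # Sum the referring links reported by each engine over all sites.
    totals = {engine: 0 for engine in ENGINES}
    for counts in link_counts.values():
        for engine in ENGINES:
            totals[engine] += counts[engine]

    # Under the proportionality premise, these ratios estimate relative index size.
    return {
        "yahoo_to_google": totals["yahoo"] / totals["google"],
        "msn_to_google": totals["msn"] / totals["google"],
        "yahoo_to_msn": totals["yahoo"] / totals["msn"],
    }


if __name__ == "__main__":
    for name, value in index_ratios(SAMPLE_LINK_COUNTS).items():
        print(f"{name}: {value:.1f} to 1")
```

Under this ratio-of-totals assumption the three figures are linked: the Yahoo!-to-MSN ratio equals the Yahoo!-to-Google ratio divided by the MSN-to-Google ratio, and 51.0 / 6.5 is roughly 7.8, which agrees with the ratios listed above.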
I have started to analyze a random sample of sites to reach a more definitive conclusion, but have been limited by time. I hope to publish such a study on this site soon, with all of the accompanying regression analysis.
Click here for the spreadsheet of referring links analysis.
