Google Index Reaches 1 Trillion URLs

Three years after Google declared that its index was <a href="http://battellemedia.com/archives/001889.php">three times larger</a> than any other search engine and then declined to cite a specific number to support that claim, it was widely believed that Google had tired of index one-upmanship and that it would no longer be measuring its index.

Thomas Claburn, Editor at Large, Enterprise Mobility

July 25, 2008

Well, Google has its yardstick in hand once again.

Two Google engineers on Friday said that Google's index of the Web now contains 1 trillion unique URLs.

That's a lot of URLs. However, there's a lot of chaff in there. Consider that "Google" alone returns 2,740,000,000 Google search results, "Yahoo" returns 2,930,000,000, and "eBay" returns 1,080,000,000. Add up the number of results generated by searching for the top 100 keywords and you'd have a significant fraction of 1 trillion.

In 1998, when Google opened for business, it had 26 million URLs. By 2000, it had reached 1 billion. In 2005, Google claimed it had more than 8 billion Web URLs in its index, at least until it took the index count off its home page. In 2008, Google's measure of the Web is 1 trillion Web URLs.

So it appears that Google's index is exploding with new Web pages. From 2000 to 2005, Google's index grew by a factor of 8. From 2005 to 2008, it grew by a factor of 125.
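The growth factors above are simple ratios of the quoted counts. A minimal sketch of that arithmetic follows, using only the figures cited in this article; the names `index_sizes` and `growth_factors` are illustrative. Since the 2005 figure was reported as "more than 8 billion," the results are approximate.

```python
# URL counts as quoted in the article (the 2005 figure was "more than 8 billion").
index_sizes = {
    1998: 26_000_000,
    2000: 1_000_000_000,
    2005: 8_000_000_000,
    2008: 1_000_000_000_000,
}


def growth_factors(sizes):
    """Return {(start_year, end_year): factor} for each consecutive pair of years."""
    years = sorted(sizes)
    return {(a, b): sizes[b] / sizes[a] for a, b in zip(years, years[1:])}


factors = growth_factors(index_sizes)
for (start, end), factor in factors.items():
    print(f"{start} to {end}: grew by a factor of {factor:.1f}")
```

The 2000 to 2005 and 2005 to 2008 ratios come out to 8 and 125, matching the figures in the text.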

There you have evidence of the information explosion that Google and other companies are trying to fight through the Information Overload Research Group.

Maybe.

Google software engineers Jesse Alpert and Nissan Hajaj admit that they don't really know how many unique Web pages there are. And they acknowledge that Web URLs are essentially infinite because dynamic page generation for things like future calendar months means there's always another page to crawl. "But we're proud to have the most comprehensive index of any search engine," they say in a blog post.
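To see why a dynamically generated calendar yields an unbounded number of URLs, consider this rough sketch. The URL pattern and the `calendar_urls` function are hypothetical, not taken from any real site or from Google's crawler; the point is only that every page can link to a "next month" page that does not exist until it is requested.

```python
from itertools import count, islice


def calendar_urls(base, start_year, start_month):
    """Yield one URL per month, forever: there is always a 'next month' page."""
    for offset in count():
        months = (start_month - 1) + offset  # months elapsed since January of start_year
        year = start_year + months // 12
        month = months % 12 + 1
        yield f"{base}?year={year}&month={month:02d}"


# Take only the first three; the generator itself never runs out.
first_three = list(islice(calendar_urls("http://example.com/calendar", 2008, 11), 3))
for url in first_three:
    print(url)
```

A crawler that follows every such link never finishes, which is why any count of "unique URLs" depends on where the crawler decides to stop.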

So forget the math. The numbers are too slippery. Really, this is about bragging rights.

About the Author(s)

Thomas Claburn

Editor at Large, Enterprise Mobility

Thomas Claburn has been writing about business and technology since 1996, for publications such as New Architect, PC Computing, InformationWeek, Salon, Wired, and Ziff Davis Smart Business. Before that, he worked in film and television, having earned a not particularly useful master's degree in film production. He wrote the original treatment for 3DO's Killing Time, a short story that appeared in On Spec, and the screenplay for an independent film called The Hanged Man, which he would later direct. He's the author of a science fiction novel, Reflecting Fires, and a sadly neglected blog, Lot 49. His iPhone game, Blocfall, is available through the iTunes App Store. His wife is a talented jazz singer; he does not sing, which is for the best.
