Commentary
Google Offers Peek At How It Controls Search Quality
Google's only goal: Improve user experience. How does it do that? According to Udi Manber, Google's VP of engineering for Search Quality, it is a heck of a lot of work. Google improves its search algorithms an average of nine times per week. Here's why.
Google's Udi Manber published a massive blog post about what exactly is going on behind Google's closed doors. It is an interesting read, and I invite you to read the entire post.
Below are some points I thought were most interesting.
Search Quality is the name of the team responsible for the ranking of Google search results. Our job is clear: A few hundreds of millions of times a day people will ask Google questions, and within a fraction of a second Google needs to decide which among the billions of pages on the Web to show them -- and in what order...
It's amazing that Google can roll out so many updates to its core product with essentially no one (okay, except for maybe Google engineers) noticing. Manber goes on to talk about Google's focus on International search, and its dedication to new features and new user interfaces. Let's not forget its spam prevention team.
For something that is used so often by so many people, surprisingly little is known about ranking at Google. This is entirely our fault, and it is by design. We are, to be honest, quite secretive about what we do. There are two reasons for it: competition and abuse. Competition is pretty straightforward. No company wants to share its secret recipes with its competitors. As for abuse, if we make our ranking formulas too accessible, we make it easier for people to game the system. Security by obscurity is never the strongest measure, and we do not rely on it exclusively, but it does prevent a lot of abuse.
The details of the ranking algorithms are in many ways Google's crown jewels. We are very proud of them and very protective of them. By some estimate, more than one thousand programmer/scientist years have gone directly into their development, and the rate of innovation has not slowed down...
The heart of the group is the team that works on core ranking. Ranking is hard, much harder than most people realize. One reason for this is that languages are inherently ambiguous, and documents do not follow any set of rules. There are really no standards for how to convey information, so we need to be able to understand all web pages, written by anyone, for any reason. And that's just half of the problem. We also need to understand the queries people pose, which are on average fewer than three words, and map them to our understanding of all documents. Not to mention that different people have different needs. And we have to do all of that in a few milliseconds.
The most famous part of our ranking algorithm is PageRank, an algorithm developed by Larry Page and Sergey Brin, who founded Google. PageRank is still in use today, but it is now a part of a much larger system. Other parts include language models (the ability to handle phrases, synonyms, diacritics, spelling mistakes, and so on), query models (it's not just the language, it's how people use it today), time models (some queries are best answered with a 30-minutes old page, and some are better answered with a page that stood the test of time), and personalized models (not all people want the same thing).
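As an aside for readers who have not seen PageRank written down: the sketch below is the generic textbook power-iteration form, not Google's production code, which Manber's post does not disclose. The function name `pagerank`, the damping factor of 0.85, the iteration count, and the four-page `toy_web` graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration (illustrative sketch only).

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to a score; scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph the page with the most incoming links, "c", ends up with the highest score, and "d", which nothing links to, ends up with the lowest. As the quoted post notes, a link-based score like this is only one signal within a much larger ranking system.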
Another team in our group is responsible for evaluating how well we're doing. This is done in many different ways, but the goal is always the same: improve the user experience. This is not the main goal, it is the only goal. There are automated evaluations every minute (to make sure nothing goes wrong), periodic evaluations of our overall quality, and, most importantly, evaluations of specific algorithmic improvements. When an engineer gets a new idea and develops a new algorithm, we test their ideas thoroughly. We have a team of statisticians who look at all the data and determine the value of the new idea. We meet weekly (sometimes twice a week) to go over those new ideas and approve new launches. In 2007, we launched more than 450 new improvements, about 9 per week on the average...
Manber says this introduction is going to be followed up by more posts that explain how Google attains its search quality, though he didn't mention how frequently.
It is Google's dedication to improving search that has placed it at the top of the heap. It's clear that Google takes the idea of search very seriously, and is committed to helping people find what they need as fast as possible.