Google, Microsoft, Yahoo and others are developing next-generation technologies that automate and personalize information search.

J. Nicholas Hoover, Senior Editor, InformationWeek Government

August 4, 2007


RESULTS ORIENTED
"Who said an edit box and 10 blue links is what search is?" asks Microsoft's Nadella. It's a good question, but one that becomes less relevant in the new world of search. Search results are being expressed in new ways, from automated clustering and categorization to actual answers to questions. Type "Seattle traffic" in Microsoft's Live Search and a map pops up with highways color-coded by how fast traffic is moving. Likewise, type "Abraham Lincoln's birthday" in Google and the first result shows the actual date--Feb. 12, 1809--above a list of related URLs.

Vivisimo, which also runs a consumer search engine called Clusty, reads through the text of Web pages and creates categories on the fly from the top 200 returned documents by using semantic understanding. Vivisimo's Clustering Engine determines that concepts such as "pretty" and "gorgeous" are related, then groups results into categories based on such commonalities. "Themes help people contextualize data and give them some kind of framework for how information is organized," says Rebecca Thompson, the company's VP of marketing.

Computer-generated clustering is especially important in business environments, where users can't rely on how popular a site is to provide a sense of relevance. Like Vivisimo, Endeca performs automatic categorization, using "guided navigation," based on the theory that people aren't always searching for something specific but instead are looking to discover something they don't explicitly know how to ask for.

Home Depot's Endeca-powered Web site shows how that works in practice. A search for "fridge" generates buckets like category, price, and brand, each of which can be narrowed. The categories are populated based on metadata about each item. "The future vision is where information summarizes out to the way you want it to look," says Matt Eichner, Endeca's VP of strategic development and marketing.
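For readers who want to see the idea in miniature, the bucket-and-narrow behavior described above can be sketched in a few lines of Python. This is only an illustration of faceted filtering over item metadata, not Endeca's implementation: the catalog, the price bands, and the function names `search`, `facet_value`, and `facets` are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical catalog: every item, brand, and price here is invented.
CATALOG = [
    {"name": "Top-freezer fridge", "category": "Refrigerators", "brand": "Acme", "price": 499},
    {"name": "French-door fridge", "category": "Refrigerators", "brand": "Borealis", "price": 1899},
    {"name": "Mini fridge", "category": "Compact Refrigerators", "brand": "Acme", "price": 149},
    {"name": "Fridge water filter", "category": "Accessories", "brand": "Borealis", "price": 39},
]


def price_band(price):
    """Map a numeric price to a coarse bucket (the bands are arbitrary)."""
    if price < 200:
        return "Under $200"
    if price < 1000:
        return "$200 to $999"
    return "$1,000 and up"


def facet_value(item, facet):
    """Read one facet from an item's metadata; price is bucketed first."""
    if facet == "price":
        return price_band(item["price"])
    return item[facet]


def search(query, filters=None):
    """Return items whose name contains the query and match every chosen facet."""
    filters = filters or {}
    hits = [item for item in CATALOG if query.lower() in item["name"].lower()]
    return [
        item
        for item in hits
        if all(facet_value(item, facet) == value for facet, value in filters.items())
    ]


def facets(items, names=("category", "brand", "price")):
    """Count how many of the given items fall into each bucket of each facet."""
    return {name: Counter(facet_value(item, name) for item in items) for name in names}


results = search("fridge")
buckets = facets(results)                        # buckets for the full result set
narrowed = search("fridge", {"brand": "Acme"})   # the user clicks one brand
narrowed_buckets = facets(narrowed)              # buckets recomputed after narrowing
```

Recomputing the buckets after each click is what lets a shopper keep narrowing without knowing in advance which attribute matters most.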

Factiva searches use technology from Fast Search & Transfer to find everything published on blogs and media sites about a brand, categorize coverage as favorable or unfavorable, quantify it, and plot a line graph that shows how perceptions change over time.

Another early example of using a search engine to gather new knowledge is Google Trends, a Google Labs project that will show searchers, say, that interest in Lake Tahoe and skiing spikes about the same time. "What if computers could understand more about the world?" Cutts asks. "If you solve that, you can really understand more about what people are searching for."

MULTIFACETED
Today's Web search engines can sift through HTML files, PDFs, Office files, and audio, video, and image metadata. Tomorrow's engines will search images, audio, and video directly--without relying on metadata--and include them with other results. "You're not going to see separate systems for audio, video, and text," says Autonomy CEO Lynch.

Google's universal search is an early step in this direction, though the relevance models for different data types don't always gel. Other signs of progress: Autonomy's technology can detect scene changes and divide video into searchable chapters. And Autonomy, Sonic Foundry, and Nexidia can all search the voice tracks of video or audio.

Like.com, which sells clothing and accessories, is an example of where image search is heading. A Likeness Search feature at the site gives users sliding scales to designate preferences for color, shape, and pattern. Microsoft and Google have both developed technology that can search for faces.

Still, image search is far from being on an equal footing with text, says IBM's Moran. People will be adding text tags to images and videos for a long time before search engines get good at looking at pictures and describing them in words.

Yet search innovations keep coming--driven largely by necessity. As petabyte after petabyte of information accumulates on the Web and in corporate databases, the tools for finding what we need must change, too.

Illustration by Richard Borge

