The search engine, that little browser tool into which you type a phrase, hit Enter, and hope for the best, is notoriously inefficient, often returning millions of off-the-mark URLs. People search for 11 minutes on average before finding what they're looking for, and half abandon searches without getting that far, according to Microsoft. By Gartner's estimate, half of potential Web sales are lost because visitors simply can't find what they want.
Google, Microsoft, Yahoo, and dozens of search specialists, including those catering to business customers, are racing to develop next-generation technologies that do a better job of getting people the information they seek. With emerging tools, people will no longer have to dumb down their queries with the pidgin language understood by first-generation search engines. They'll be able to ask questions in English and other languages--or pose no question at all and automatically receive results based on their earlier queries or the applications they're using.
The results users do get will include audio and video files, PowerPoint slides and other infographics, and structured data--all in one stream of results culled from the Web, PCs, and company databases. Over time, image searches will even detect information in the image itself, rather than relying on the metadata attached to it.
Search results will be more accurate and automatically summarized, with relevance determined by individual preferences. New methods of presentation such as clustering, tag clouds, graphical scales that widen or narrow searches based on parameters, and automated categorization will make it easier to navigate results. And search engines will be enhanced by human intelligence and the wisdom of crowds through tagging, social bookmarking, and shared searches.
We won't have to wait long for some of these souped-up search engines. The following advanced capabilities are beginning to surface in a variety of places.
Most of today's search engines require a shorthand language some describe as keywordese. "It's kind of like talking to a 2-year-old," says Barney Pell, CEO of Powerset, a startup applying natural language processing to search. Over the next decade, Pell says, search engines will become more sophisticated in their ability to "understand meaning."
Semantic search engines parse language much as a student of English does, using dictionaries and thesauri to interpret the meaning of words and linking them through common rules of syntax and sentence structure. The sentence "IBM bought Tivoli for $743 million in 1996," for example, includes concepts such as buying, buyer, the company bought, year of purchase, and purchase price.
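To make the IBM-Tivoli example concrete, here is a minimal Python sketch of pulling those concepts out of a sentence. It is an illustration only, not how Powerset, Hakia, or any other engine actually works: it assumes a single hand-written pattern for one sentence shape, where a real semantic engine would rely on dictionaries, thesauri, and grammar rules. The names `PURCHASE_PATTERN` and `extract_purchase` are hypothetical.

```python
import re
from typing import Optional

# Hypothetical pattern covering one sentence shape only:
# "<buyer> bought <target> for $<amount> million|billion in <year>"
PURCHASE_PATTERN = re.compile(
    r"^(?P<buyer>.+?) bought (?P<target>.+?) "
    r"for \$(?P<price>[\d.,]+ (?:million|billion)) in (?P<year>\d{4})\.?$"
)


def extract_purchase(sentence: str) -> Optional[dict]:
    """Return the concepts in a purchase sentence, or None if it doesn't fit."""
    match = PURCHASE_PATTERN.match(sentence.strip())
    if match is None:
        return None
    return {
        "event": "buy",                       # the act of buying
        "buyer": match.group("buyer"),        # who bought
        "target": match.group("target"),      # the company bought
        "price": "$" + match.group("price"),  # purchase price
        "year": int(match.group("year")),     # year of purchase
    }


frame = extract_purchase("IBM bought Tivoli for $743 million in 1996")
print(frame)
```

Once a sentence is reduced to fields like these, a question such as "Who bought Tivoli?" can be answered by looking up the `buyer` field rather than by matching keywords.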
For now, the process is aided by human beings who apply language rules and define categories to narrow searches, though Hakia's search engine can use language cues to find rough meaning in concepts it doesn't yet understand. "If it was fully automated, we would claim we have invented a human being," says Riza Berkan, Hakia's CEO. Web search engines like Google and Yahoo employ linguists, too, though they're not as far along with semantic search as Hakia or Powerset. Google's search engine can check spelling and return synonyms and variations of words, but it doesn't always answer questions accurately.
The technology of enterprise search company Autonomy powers the Federal Preservation Institute's Historic Preservation Learning Portal, a gateway to documents on preservation rules and methods. The institute uses semantic search to help nonexperts find information. "This allows them to ask in plain language questions that do not have the technical lingo that keywords may have," says Constance Ramirez, the institute's director. For example, a site visitor may ask about the preservation of red tile roofs in California. "It's really fascinating to see all the kinds of things that come back as relevant," says Ramirez.
IBM is working on specialized text analysis in fields such as health care and government. Customers use its OmniFind Analytics search engine to determine nuances like sentiment--whether a document reflects negatively or positively on a subject--and define and relate specialized words, concepts, and proper nouns used inside a company.
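The idea of scoring whether a document reflects positively or negatively on a subject can be sketched in a few lines of Python. This is a rough word-counting illustration under stated assumptions, not IBM's method or anything drawn from OmniFind Analytics: the two word lists and the `sentiment` function are hypothetical, and commercial text analysis goes well beyond counting words.

```python
import re

# Hypothetical, deliberately tiny word lists; a real system would use
# far larger, domain-specific vocabularies.
POSITIVE = {"good", "great", "excellent", "reliable", "helpful"}
NEGATIVE = {"bad", "poor", "slow", "broken", "unreliable"}


def sentiment(text: str) -> str:
    """Label text by counting positive words minus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Calling `sentiment("Search was slow and the results were poor")` counts two negative words and no positive ones, so the text is labeled negative. A sentence with no words from either list comes back neutral.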