Envisioning the future of search, Google's Larry Page recently said, "We want to create the ultimate search engine that can understand anything ... some would call that artificial intelligence. It would give you back the exact right thing instantly."
However, for searches to approximate human reasoning, they need more than keyword matching according to hierarchies of what is popular at the time--the basic technique used by many Internet search engines. Artificial intelligence (AI) can mean many things, and there are myriad technologies aimed at understanding the intent behind a search: machine learning, natural language processing (NLP), latent semantic indexing, ranking algorithms, orthogonal and nonorthogonal term vectors and spectral analysis, to name a few.
"The bottleneck today is the lack of knowledge about the user," says AI expert Ruth Aylett, a professor of mathematics and computer science at Heriot-Watt University in the United Kingdom. What would help, Aylett says, is an understanding of individuals' profiles and the shifting roles they play--parent, employee, hobbyist or political party member.
"We want to proactively push valuable information that is conceptually related to a person's interests," says Stouffer Egan, CEO of Autonomy. To do that, companies like Autonomy, Blinkx and Intellext exploit some degree of language comprehension through desktop searches of information; understanding is based on what people read, write and seek out in the context of enterprise subjects and domains such as financial services, marketing or IT.
While Autonomy, FAST and other enterprise-focused search vendors assert that Google techniques such as page ranking can't decipher intent inside corporate firewalls, the search giant counters that it applies many more sophisticated techniques.
"Page rank is one factor with which we work; others are classification, clustering and synonym finding," says Peter Norvig, Google's director of search quality. Norvig adds that Google is also working with technologies such as statistical machine translation, speech recognition and entity detection. The plan is to leverage what Google "owns" on the Web to learn as many words, and consequent word relations, as possible. That, he says, would enable intuitive, cognitive "conversations" to take place between searcher and search engine.
"We are on our way to learning from more than 1 trillion words procured from public Web pages, where others may have a billion," he says, adding, "there's no data like more data. ... Regardless of how clever the algorithm, the number of words is a critical factor."
Despite Google's substantial resources, it remains to be seen whether the company can achieve natural thinking and reasoning in search. In May, the search giant introduced Google Co-op, which lets experts in specialized areas build customized databases in their areas of expertise. "Then you can start defining a hierarchy of intent and a subclass of questions about different topics to publish out to others," Norvig says.
As the use of AI by Google and enterprise-focused rivals grows, searchers remain consistent in one way: They want information based on what they do not yet know. The quest will keep us moving toward the ultimate search, as engines evolve to understand not only the words, but the intent behind the words. --Susana Schwartz
FAST LANE: New York's Finest Get High-Tech Crime Center
New York City's new Real-Time Crime Center is using data-mining and visualization technology from ABM America to find patterns among crimes and offenders. Databases, pictures, video clips and streaming video footage are being examined in hopes of catching criminals quickly.