The InformationWeek -- Blogs


Topics:   Web Tech


The Enterprise Future Of Semantic Search


Posted by J. Nicholas Hoover, May 13, 2008 12:04 PM

Powerset launched a tool Monday to search Wikipedia and the open source database Freebase, but the technology that powers the search startup could wind up at home in a corporate setting.

Powerset specializes in what's become known as natural language or semantic search. Rather than relying exclusively or primarily on linking algorithms, as Google and other major Web search engines do, Powerset also uses an "ontology" of syntax, grammar, and sentence structure and, to a lesser extent, thesauri, in an attempt to pull meaning from queries and Web pages -- or, in the case of businesses, eventually from documents and files. "Search is only a small part of the engine here," Powerset CTO Barney Pell said in an interview. "Really, it's a content understanding engine."

Other search companies are getting in on the semantic game as well, and are looking toward businesses as potential customers.

• Startup Hakia has begun targeting users searching for legal, medical, and financial information, and licensed its technology to a startup that summarizes information for law enforcement, government agencies, and pharmaceutical companies.
• Semantra is a new enterprise search startup focused exclusively on semantic search, and does what it calls "conversational analytics" for Microsoft CRM and all major relational databases.
• Q-Go is a Dutch company that does natural language in several verticals. Its customers include DHL, KLM, and Deutsche Telekom.
• Cognition similarly does vertical search with a natural language angle, and has its own Wikipedia search.
• Inquira uses natural language search in its customer service app to help support staff answer broad or unclear questions, and counts among its customers Honda, SunTrust, and Honeywell.
• Astute Solutions' RealDialog uses natural language processing for Web self-service support.
• Even the major Web and corporate search companies, including Google, IBM, and Microsoft, have their own semantic search efforts under way.

Powerset can distill a Wikipedia article into key concepts by picking out verb-noun relationships. A search for David Lee Roth, for example, brings back the information that he was the lead singer for Van Halen, later left the band, and at some point released an album called Skyscraper. Since the engine also uses Freebase -- it could do the same with, say, a drug catalog or a customer database if they were structured properly -- the results page also returns a short bio for Roth.
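To make the idea of verb-noun relationships concrete, here is a minimal sketch in Python. It is not Powerset's method: the fixed list of verb phrases, the function names, and the hand-written sentences are all assumptions made for illustration, and a real engine would use a full linguistic parser rather than a regular expression.

```python
import re

# Hypothetical verb phrases the toy extractor recognizes. A real engine
# would rely on a parser, not a fixed list.
VERB_PHRASES = ["was the lead singer for", "left", "released"]

# subject, then one known verb phrase, then object, with an optional period.
PATTERN = re.compile(
    r"^(?P<subj>.+?)\s+(?P<verb>"
    + "|".join(re.escape(v) for v in VERB_PHRASES)
    + r")\s+(?P<obj>.+?)\.?$"
)

def extract_triples(sentences):
    """Return (subject, verb phrase, object) triples from simple sentences."""
    triples = []
    for sentence in sentences:
        match = PATTERN.match(sentence.strip())
        if match:
            triples.append((match["subj"], match["verb"], match["obj"]))
    return triples

def facts_about(triples, subject):
    """List the (verb phrase, object) pairs recorded for one subject."""
    return [(verb, obj) for subj, verb, obj in triples if subj == subject]

# Hand-written sentences paraphrasing the facts mentioned above.
SENTENCES = [
    "David Lee Roth was the lead singer for Van Halen.",
    "David Lee Roth left Van Halen.",
    "David Lee Roth released Skyscraper.",
]

if __name__ == "__main__":
    for verb, obj in facts_about(extract_triples(SENTENCES), "David Lee Roth"):
        print(verb, "->", obj)
```

Once text is reduced to triples like these, the same lookup would work on any source that can be turned into subject-verb-object facts, which is the point about drug catalogs and customer databases.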

Since semantic search engines can pick at meaning, they also help when the searcher has a broad question that an engine with no grasp of concepts or words couldn't answer. The classic Powerset example is a search for "politicians killed by disease," which brings back the information that, for example, Benjamin Harrison died of a combination of the flu and pneumonia. Imagine querying a business search engine for something like "customers in a hurricane zone" or "employees in senior management."
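One way to picture a concept-level query like "politicians killed by disease" is as a lookup over facts plus a small hierarchy of broader terms. The sketch below assumes a hand-built `IS_A` table and a `FACTS` list, both invented for illustration; nothing here reflects how Powerset actually stores or matches concepts.

```python
# Toy hierarchy: each term maps to its broader concept (assumed data).
IS_A = {
    "flu": "disease",
    "pneumonia": "disease",
    "gunshot wound": "injury",
    "president": "politician",
    "Benjamin Harrison": "president",
    "Abraham Lincoln": "president",
}

# Toy facts as (subject, relation, object) triples.
FACTS = [
    ("Benjamin Harrison", "died of", "flu"),
    ("Benjamin Harrison", "died of", "pneumonia"),
    ("Abraham Lincoln", "died of", "gunshot wound"),
]

def is_a(term, concept):
    """Walk up the hierarchy to test whether term falls under concept."""
    seen = set()
    while term is not None and term not in seen:
        if term == concept:
            return True
        seen.add(term)
        term = IS_A.get(term)
    return False

def killed_by(subject_concept, cause_concept):
    """Map each matching subject to the causes that fit the cause concept."""
    matches = {}
    for subject, relation, cause in FACTS:
        if (relation == "died of"
                and is_a(subject, subject_concept)
                and is_a(cause, cause_concept)):
            matches.setdefault(subject, []).append(cause)
    return matches

if __name__ == "__main__":
    print(killed_by("politician", "disease"))
```

The query never mentions "flu" or "president"; the match comes from walking each term up to its broader concept, which is what a keyword index alone cannot do.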

Business information is often highly specialized, rare, and placed into odd structures. Powerset doesn't have to know the meaning of a word to understand related concepts, so it and search engines like it could be of use in the corporate world. Semantra, for example, might answer a CRM query such as "which retail accounts in Baltimore have sales opportunities of more than $50,000 to women before next week?" or something like it.
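A natural language CRM question ultimately has to become a structured filter over records. The following sketch shows what that target might look like for part of the example query; the `Opportunity` fields, the sample records, and the Monday-based reading of "next week" are all hypothetical, the "to women" condition is left out, and none of it describes Semantra's product.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Opportunity:
    account: str
    segment: str
    city: str
    amount: float
    close_date: date

def start_of_next_week(today):
    """Return the Monday after `today` (assumes weeks start on Monday)."""
    return today + timedelta(days=7 - today.weekday())

def matching_accounts(opportunities, today):
    """Retail accounts in Baltimore with an opportunity over $50,000
    that closes before next week begins."""
    cutoff = start_of_next_week(today)
    return sorted({
        o.account
        for o in opportunities
        if o.segment == "retail"
        and o.city == "Baltimore"
        and o.amount > 50_000
        and o.close_date < cutoff
    })

# Hypothetical sample records.
OPPORTUNITIES = [
    Opportunity("Harbor Outfitters", "retail", "Baltimore", 75_000, date(2008, 5, 16)),
    Opportunity("Harbor Outfitters", "retail", "Baltimore", 20_000, date(2008, 5, 15)),
    Opportunity("Charm City Books", "retail", "Baltimore", 60_000, date(2008, 5, 23)),
    Opportunity("Bay Logistics", "wholesale", "Baltimore", 90_000, date(2008, 5, 14)),
    Opportunity("Capitol Goods", "retail", "Washington", 80_000, date(2008, 5, 15)),
]

if __name__ == "__main__":
    print(matching_accounts(OPPORTUNITIES, date(2008, 5, 13)))
```

The hard part for a semantic engine is the translation step, mapping phrases such as "retail accounts" and "before next week" onto fields and cutoffs like these; the filter itself is ordinary database work.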

There's no dictionary defining an iPhone for Powerset, or telling it that the iPhone is made by Apple, or even that it's a device, yet a search distills facts about it, showing, for example, that it supports Bluetooth, a fact that's pulled from the Apple Wikipedia entry, not the iPhone's entry. Semantic technologies can also help return results when information is rare or poorly labeled, because they don't rely on links.

That's all the good news. The bad news: Powerset's far from ready for prime time, and neither are some of the other semantic search engines. Google still beats Powerset hands down on a number of queries, and other semantic search engines only have limited rule bases or dictionaries backing them up. "Try before you buy" is the operative phrase here.











