The InformationWeek -- Blogs

Full Nelson



Truevert's Semantic Search


Posted by Fritz Nelson, Jan 21, 2009 01:05 AM

Semantic search is like porn: I'm pretty sure I'll know it when I see it. So when semantic search upstart Truevert came by for a visit, I got all googly (I think I might have even screamed "yahoo"). The Truevert system, powered by OrcaTec's discovery toolkit, is narrowly focused on green topics, but it's definitely an eye-opening, fresh approach to an elusive problem.


Here is Part 1 of our video discussion with Truevert, including a demonstration of the technology.

Here's Part 2, where we discuss competitors (namely Powerset, now owned by Microsoft) and the nature of other ontological approaches to semantic search.

To be fair, whenever I hear about the semantic Web, I think of a magic, omniscient elf scurrying around squillions of sites, assigning meaning based on the context of, well, everything. So my expectations are high. But frankly, so is my disappointment with traditional search, even if it's changed how most of us view and use the Web. On the one hand, I don't want to become a concatenation expert, but neither do I want Aunt Millie's musings on managing her household budget when I search Google for microfinance. These seem to be my only two choices for better results.

OrcaTec co-founder Herbert Roitblat began by saying that ontology, often thought of as the way toward a semantic Web, is flawed. (He also began by saying that Google's PageRank is a popularity contest.) There are lots of ways to categorize information and almost no agreement on which one to use, and the people designing these schemas are not the same people looking for the information.

Even if you are precise in your search terms on a normal search engine, Roitblat summarized, you're really narrowing by exclusion rather than by precision. If you enter "Green Toilets" in Yahoo in an attempt to find more energy-efficient commodes, you will instead find avocado or sea-foam green colored toilets.

A true semantic-based approach trusts a context, rather than a categorization. OrcaTec started Truevert with a more vertical approach, namely "green," so everything gets searched through that filter. It uses Yahoo BOSS to gather Web search results, then re-ranks them using its own language model. That model is derived from the associations and context of words in 6,000 green-tagged documents in Delicious (training takes less than 15 minutes on a mere laptop). Google's terms of service, Roitblat says, don't allow re-ranking of pages the way Truevert does it.
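To make the re-ranking idea concrete, here is a minimal Python sketch. It is not Truevert's or OrcaTec's actual method, which the company didn't spell out in that detail. It assumes a simple smoothed unigram language model, and it adds a general "background" model for comparison, which is my own assumption. The function names, the tiny sample corpora, and the hardcoded results (standing in for what a search API would return) are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_unigram_model(documents):
    """Count word frequencies across a list of documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


def score(text, domain, background, alpha=1.0):
    """Average log-ratio of domain vs. background word probability.

    Uses add-alpha smoothing so unseen words do not zero out the score.
    Positive scores mean the text reads more like the domain corpus.
    """
    tokens = tokenize(text)
    if not tokens:
        return float("-inf")
    vocab_size = len(set(domain) | set(background))
    domain_total = sum(domain.values())
    background_total = sum(background.values())
    total = 0.0
    for tok in tokens:
        p_domain = (domain[tok] + alpha) / (domain_total + alpha * vocab_size)
        p_background = (background[tok] + alpha) / (
            background_total + alpha * vocab_size
        )
        total += math.log(p_domain / p_background)
    return total / len(tokens)


def rerank(results, domain, background):
    """Sort search results so the most domain-like text comes first."""
    return sorted(
        results, key=lambda r: score(r, domain, background), reverse=True
    )


# Hypothetical stand-ins for the green-tagged training documents.
green_docs = [
    "low-flow toilets save water and energy",
    "energy efficient appliances reduce water use",
    "solar energy and water conservation at home",
]
# Hypothetical general-purpose text (an assumption, not from the article).
general_docs = [
    "avocado green paint colors for the bathroom",
    "sea-foam green toilets and retro bathroom colors",
    "choosing paint colors for your home",
]

domain_model = train_unigram_model(green_docs)
background_model = train_unigram_model(general_docs)

# Hardcoded results standing in for what a Web search would return.
results = [
    "Retro avocado green toilets in vintage bathroom colors",
    "Low-flow toilets that save water and energy",
]
reranked = rerank(results, domain_model, background_model)
print(reranked[0])
```

In this toy version, the color-of-the-porcelain result drops below the water-saving one because its words look more like general text than like the green-tagged training set. A real system would need far more training text and something richer than single-word counts to capture context.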

Roitblat says the company chose green because it wanted to start out doing some good, but also because it's a category people can easily understand. The same approach can be applied to any vertical. You could even apply it to enterprise content management, given that most corporations have their own jargon -- you just train the engine on the documents that you index.

You can also imagine that more precise search results would allow better ad matching, and that a decent amount of ad revenue might follow.

Truevert competes with a growing list of other new players, like Hakia, Powerset (recently purchased by Microsoft), and Thomson Reuters' Calais. I haven't talked with any of these companies. Yet. I'm sure they'll find me.





 
 
