Commentary

John Foley
Editor, InformationWeek  

Automated Search And The Advil Test

One of the trends identified by Nick Hoover in The Ultimate Search Engine is that search technologies are becoming increasingly automated, executing searches and delivering results without being asked. Startup Kosmix.com is about to formally launch three Web sites based on that concept. It's a big idea, but hard to pull off.

Kosmix, founded in 2004, will introduce RightHealth.com (health care), RightAutos.com (cars), and RightTrips.com (travel), all of which are up and running now. Each site is divided into various sections that have been filled with content gathered by Kosmix's "categorization engine." They're examples of what's possible when a search engine combs the Web, then presents results in something other than a stack of URLs.

RightHealth is a highly automated health care portal. From the home page, you can click on a popular topic -- from "Advil" to "Zoloft" -- or type in a word or phrase, say "freckles." That takes you to a subject-specific page with a dozen or so categories: images, video, news, in-depth articles, tools to refine and explore related areas, a list of top Web sites, a blurb from and link to Wikipedia, and ad-sponsored links. Tabs at the top of the page get you to research, news, blogs, and communities.


Each page informs you how many sites were scoured to pull together the information displayed. The page on high blood pressure, for example, was assembled by scanning 121,876 pages on the Web. It's an impressive demonstration of search automation, but the limitations are obvious, too. RightHealth's Advil page, for example, is junked up with metadata ("buy Advil cheap, cheap Advil, buy cheap Advil, online Advil, information Advil, order Advil online, Advil online pharmacy…"), while its "overexertion" page includes information on physical fatigue and a heavy metal band by that name.

That's one of the fundamental problems with automated Web search -- a lot of off-topic debris gets included with the meaningful information. Kosmix claims that its search/categorization technology works for virtually any subject, that it can create an "unofficial home page for every topic on the Web." That may be the direction things are heading, but it's not there yet. Type in "soccer" or "concrete," for example, and you get the familiar stack of URLs, not home pages like those on RightHealth, RightAutos, and RightTrips.

Kosmix (investors include Accel Partners, Lightspeed Venture Partners, and Amazon CEO Jeff Bezos) also has search portals devoted to finance, politics, and video games. The company's auto-generated, on-the-fly home pages are 75% of the way there in six broad areas. What's left to be done, however, may be the toughest part: a better signal-to-noise ratio across more topics.

