Search in Focus: Implementing a Taxonomy - InformationWeek

11/21/2006
11:30 AM

Search in Focus: Implementing a Taxonomy

Search engines don't know the difference between reading glasses and drinking glasses, but a taxonomy puts your query in context. We outline several ways to build taxonomies, ranging from the tough but potentially more accurate approach of building from scratch to the easier but potentially compromised approach of buying a prebuilt taxonomy or using automated clustering software.

Mention the word "taxonomy" and some people will think you mean stuffing dead animals (as in taxidermy). Although the term may not be well known, taxonomies (or sets of categories) are used to organize large quantities of information on the Internet, in portals and in enterprise data repositories. Taxonomies bring context to words, topic areas and search results.

Finding a piece of information within a large collection of data without a taxonomy is like driving in unknown territory without the benefit of a map or road signs: You may eventually stumble upon your destination, but chances are you'll encounter a lot of dead ends and detours first. A taxonomy provides a hierarchical structure of categories, from general to specific. In biology, for instance, dogs are classified under the kingdom Animalia, the phylum Chordata, the class Mammalia, the order Carnivora, the family Canidae, the genus Canis, and the species Canis familiaris.
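To make the general-to-specific structure concrete, here is a minimal sketch of a taxonomy as a tree of categories, using the dog classification above. The `Category` class, its method names, the "Life" root node and the extra Felidae branch are assumptions added for illustration; they do not come from any particular taxonomy product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Category:
    """One taxonomy node: a named category holding its narrower child categories."""
    name: str
    children: Dict[str, "Category"] = field(default_factory=dict)

    def add_path(self, path: List[str]) -> None:
        """Insert a general-to-specific chain of category names below this node."""
        node = self
        for name in path:
            # Reuse the child if it already exists, otherwise create it.
            if name not in node.children:
                node.children[name] = Category(name)
            node = node.children[name]

    def find_path(self, target: str) -> Optional[List[str]]:
        """Return the chain of names from this node down to target, or None."""
        if self.name == target:
            return [self.name]
        for child in self.children.values():
            sub_path = child.find_path(target)
            if sub_path is not None:
                return [self.name] + sub_path
        return None


# "Life" is a hypothetical root added so the tree has a single top node.
root = Category("Life")
root.add_path(["Animalia", "Chordata", "Mammalia", "Carnivora",
               "Canidae", "Canis", "Canis familiaris"])
# A sibling branch (cats) to show that categories share their broader ancestors.
root.add_path(["Animalia", "Chordata", "Mammalia", "Carnivora", "Felidae"])

dog_path = root.find_path("Canis familiaris")
print(" > ".join(dog_path))
```

Looking up a specific category returns every broader category above it, which is the context a plain keyword search lacks.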

When combined with metatagging tools, text analytics and search software, enterprise taxonomies support accurate search and guided navigation that could not be achieved with search engines alone. As data volumes increase, so, too, does the need for taxonomy. If you have 100 documents, almost any search technique will work, but if you have a terabyte's worth of documents, you need sophisticated search guided by a taxonomy.

We outline several ways to build taxonomies, ranging from the tough but potentially more accurate approach of building from scratch to the easier but potentially compromised approach of buying a prebuilt taxonomy or using automated clustering software. We also examine deployment and ongoing maintenance practices, as well as the role of ontologies, which might come into play in merger and acquisition scenarios.

ASSESS THE NEED

An enterprise taxonomy attempts to classify virtually all information in an organization and bring it under one structure. Despite the many benefits (see "10 Good Reasons To Use a Taxonomy"), building an enterprise-wide taxonomy is easier said than done. Inevitably, each department has its own priorities, terminology and preferred structure for its body of information, so it's hard to get everyone to agree on one core set of categories. "Customers say this takes a long time, and they talk about people in a room yelling at each other," says Fern Halper, a partner at the research and consulting firm Hurwitz & Associates.

In some settings, universal taxonomies are an absolute must. At the Department of Homeland Security and public safety agencies, for example, taxonomies help tie together clues, establish relationships between crucial tidbits of information and spot broader security or safety threats.

Your company may or may not need an organization-wide taxonomy depending on the problems you're trying to solve. "If your application is simply to enable better retrieval of documents or better kinds of communication with structured data in databases, it may not be necessary," says Josh Powers, principal ontologist at search vendor Convera. "But if your goal is better communication throughout the company, you need to come to some agreement."

When it's time to build, there are two approaches: the tough road of trying to create and enforce a taxonomy through task forces, management edicts, training and so on; or the appeasement route, in which you create mappings between differing points of view. If the sales organization looks at the market in a different way than the product management group, you would choose the latter approach, and automated mappings could reconcile the two views with a central taxonomy (perhaps with the aid of an ontology, but more on that later).
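The mapping approach can be sketched in a few lines: each department keeps its own labels, and a lookup table translates them into one central category. This is only an illustration under stated assumptions. The department names, local terms, category paths and function names below are all hypothetical, and a real deployment would likely rely on a taxonomy management tool rather than hand-written dictionaries.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical central taxonomy categories, invented for illustration.
CENTRAL_CATEGORIES = {
    "Products/Eyewear/Reading Glasses",
    "Products/Tableware/Drinking Glasses",
}

# Hypothetical department vocabularies mapped onto the central categories.
DEPARTMENT_MAPPINGS: Dict[str, Dict[str, str]] = {
    "sales": {
        "readers": "Products/Eyewear/Reading Glasses",
        "barware": "Products/Tableware/Drinking Glasses",
    },
    "product_management": {
        "vision-aids": "Products/Eyewear/Reading Glasses",
        "glassware": "Products/Tableware/Drinking Glasses",
    },
}


def to_central(department: str, local_term: str) -> Optional[str]:
    """Translate a department's own label into the central category, if mapped."""
    return DEPARTMENT_MAPPINGS.get(department, {}).get(local_term.lower())


def same_category(dept_a: str, term_a: str, dept_b: str, term_b: str) -> bool:
    """True when two departments' labels resolve to the same central category."""
    central_a = to_central(dept_a, term_a)
    central_b = to_central(dept_b, term_b)
    return central_a is not None and central_a == central_b


def invalid_mappings() -> List[Tuple[str, str, str]]:
    """List (department, term, target) entries whose target is not a central category."""
    return [
        (dept, term, target)
        for dept, terms in DEPARTMENT_MAPPINGS.items()
        for term, target in terms.items()
        if target not in CENTRAL_CATEGORIES
    ]


print(to_central("sales", "Readers"))
print(same_category("sales", "readers", "product_management", "vision-aids"))
print(invalid_mappings())
```

Neither group has to give up its vocabulary; the cost is that someone must maintain the mappings and check that each one still points at a valid central category.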
