November 1, 1999
TechView:
Indexing can be a powerful tool for data mining, especially when applied cleverly. For data-driven applications such as stock-trading tickers, applying rules to existing index technologies can help you find important information buried in heaps of data. A clever system could sift through a huge volume of trading and corporate disclosure information to show companies whose equities behave in a certain way over time or trades made by investors who are active in equities that are unusual for them. By cleverly interrelating data, you can get rich.
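To make the idea of applying rules to indexed trading data concrete, here is a minimal sketch of one such rule: flag trades by investors in sectors they have not traded before. The function name, the `(investor, sector)` record shape, and the sample data are all hypothetical, invented for illustration; a real system would work from far richer trade and disclosure records.

```python
from collections import defaultdict

def unusual_trades(history, new_trades):
    """Return new trades in a sector the investor has never traded before.

    Both arguments are lists of (investor, sector) pairs, a hypothetical
    record shape chosen for this sketch.
    """
    # Index the history: investor -> set of sectors already traded.
    usual = defaultdict(set)
    for investor, sector in history:
        usual[investor].add(sector)

    # Rule: a known investor trading outside their usual sectors is flagged.
    # Investors with no history are skipped, since nothing is "usual" yet.
    return [
        (investor, sector)
        for investor, sector in new_trades
        if investor in usual and sector not in usual[investor]
    ]

# Made-up sample data.
HISTORY = [("fund_a", "utilities"), ("fund_a", "utilities"), ("fund_b", "tech")]
NEW_TRADES = [("fund_a", "biotech"), ("fund_b", "tech"), ("fund_c", "energy")]

FLAGGED = unusual_trades(HISTORY, NEW_TRADES)
```

With this sample data, only the `fund_a` trade in `biotech` is flagged. The rule works because the data is structured: a sector code means the same thing in every record.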
But that's data-driven, not knowledge-based. Try to apply this stock-trading analysis to a well-integrated keyword index of all your company's documents--E-mail, reports, presentations, what have you--and you'll quickly descend into Babel. Automating information retrieval, rather than data retrieval, is a technology that must be built on understanding language rather than vocabulary. Language processing is the real enabling technology for automated information systems. Particularly in spoken-language recognition and conversational computing applications, there are a lot of promising developments in teaching computers how to understand language. (For fascinating examples of the state of this art, check out the Spoken Language Systems group at MIT: www.sls.lcs.mit.edu.)
For the next five years, however, you'll have to find an alternative solution. Without automated language processing, you'll have to rely on a key staffer not found in most IT shops--a librarian. Even with today's indexing tools, businesses could better inform their decision-making if someone were truly the custodian of their institutional knowledge. Intranets and application integration can make systems accessible, but they'll never really leverage the knowledge lost in all those words.

Practically all information-retrieval technologies rely on some variation of keyword or category indexing. Smart indexing systems may be able to infer more information than others about a document by looking for the frequency and juxtaposition of words or "learning" which words are most important, but even these systems are crippled by their reliance on vocabulary rather than meaning. It's as if these systems pretend to understand Plato's writings because they have mastered an ancient Greek dictionary. It just won't work.
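The limitation is easy to see in a toy keyword index. The sketch below is not any particular product's method; it is a bare-bones inverted index that ranks documents by word frequency, with hypothetical function names and made-up sample documents.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and pull out runs of letters; no stemming, no synonyms."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(docs):
    """Build an inverted index: word -> {doc_id: occurrence count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word][doc_id] = index[word].get(doc_id, 0) + 1
    return index

def search(index, query):
    """Rank documents by how often they contain the query words."""
    scores = defaultdict(int)
    for word in tokenize(query):
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken alphabetically by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Made-up documents for illustration.
DOCS = {
    "memo": "Revenue fell last quarter and revenue forecasts were cut.",
    "report": "Sales dropped sharply; income projections were lowered.",
    "invite": "The revenue team picnic is on Friday.",
}

INDEX = build_index(DOCS)
RESULTS = search(INDEX, "revenue")
```

A search for "revenue" returns the memo and the picnic invitation but misses the report, which covers the same subject in different words. Frequency counting can rank the memo above the invitation, but it cannot tell that "sales dropped" and "revenue fell" mean nearly the same thing.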