Categorizing audiovisual content by subject matter is a popular strategy, as it once was with text-based Web pages. But on the Internet, keyword search technology has proven to be more effective for finding relevant text than category-based directories.
BBN Technologies, a defense contractor responsible for some of the Internet's technical underpinnings, wants to bring keyword search to spoken content through Podzinger.com, the company's podcast search engine.
Alex Laats, president of BBN's Delta Division, which aims to commercialize BBN's technology, says his company's speech-to-text software creates a text-based index of audio content. Using that index, Podzinger can search podcasts more effectively than it could using only the metadata typically associated with media files.
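To illustrate the idea of a text-based index of audio, here is a minimal Python sketch. It is not BBN's or Podzinger's implementation: the episode names, the recognized words, and the functions `build_index` and `search` are all hypothetical. It assumes a speech recognizer has already produced time-stamped words for each episode.

```python
from collections import defaultdict

# Hypothetical recognizer output: (word, start time in seconds) per episode.
# A real speech-to-text system would supply this; the data here is made up.
RECOGNIZED = {
    "episode-a": [("welcome", 0.0), ("to", 0.4), ("the", 0.5),
                  ("internet", 0.7), ("history", 1.2), ("show", 1.7)],
    "episode-b": [("today", 0.0), ("we", 0.3), ("talk", 0.5), ("about", 0.8),
                  ("search", 1.1), ("and", 1.5), ("the", 1.6), ("internet", 1.8)],
}


def build_index(recognized):
    """Map each spoken word to the episodes and time offsets where it occurs."""
    index = defaultdict(lambda: defaultdict(list))
    for episode, words in recognized.items():
        for word, start in words:
            index[word.lower()][episode].append(start)
    return index


def search(index, query):
    """Return episodes containing every query term, with offsets per term."""
    terms = query.lower().split()
    if not terms:
        return {}
    candidates = None
    for term in terms:
        # .get avoids creating empty entries for words that were never spoken
        episodes = set(index.get(term, {}))
        candidates = episodes if candidates is None else candidates & episodes
    return {ep: {t: index[t][ep] for t in terms} for ep in sorted(candidates)}


if __name__ == "__main__":
    idx = build_index(RECOGNIZED)
    print(search(idx, "internet search"))
```

Because each match carries a time offset, a player could jump straight to the spoken passage, which a search over titles and RSS descriptions alone cannot do.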
"If you go into the iPod search tool right now, it's very difficult to find topics of interest because it's only searching the metadata, or RSS information that is associated with the content," Laats explains.
Podzinger promises more relevant search results. "We've invested a lot of energy in making the user experience a good one, because what we're trying to do is make it as easy for a user to search on multimedia content as it is to search on text content," says Laats.
Other companies are trying to do this, too. Desktop search company Blinkx offers software for searching the audio tracks in videos and podcasts. And Podscope, run by media search and services company TVEyes Inc., also deciphers podcasts to make them searchable.
Since the BBN software that translates speech to text isn't perfect, the result doesn't quite qualify as a transcription. Laats calls it a "text index." "Typically, when we say transcription, we expect accuracy in the 99% range," he says, noting that effective search requires accuracy in the 70% to 90% range.
Even so, Podzinger provides a rough transcript in its search results and lets you play selected excerpts that are broken out in the search results list, provided you're using Internet Explorer 5.0 or higher and have the RealPlayer software installed.
Beyond these platform and software constraints, there's more room for improvement. The results can be hit-or-miss, because it's not always possible to determine everything about a podcast from its audio track.
For example, a search for "Vint Cerf podcast" turned up no results on Podzinger. Using Google, those keywords pointed directly to an InformationWeek podcast featuring Google Internet evangelist Vint Cerf.
The addition of more extensive metadata might have made the Vint Cerf podcast easier to find, but Podzinger also bears some responsibility for improving its results through refined technology and by expanding its index. Podzinger's index currently includes 18,000 podcasts, about 75% of Apple's inventory.
Given time, Podzinger and its ilk could become as indispensable as Google. At the moment, they remain promising works in progress.