Nexidia Inc., a software company that analyzes sounds and speech, has a distinctive answer to the problem of making sense of ever-larger volumes of audio data: speed.
The company has developed a "phonetic search engine" that processes phonemes, the basic units of speech that make up words. By working with phonemes rather than trying to analyze whole words, the company's technology can search audio recordings 50 times faster than real time. Traditional speech-to-text systems can process files only four times faster than real time.
Speed is vital for intelligence applications such as monitoring audio traffic for enemy communications. "In Iraq, our military uses the term 'situational awareness,'" says retired Lt. Gen. Kenneth Minihan, a former director of the National Security Agency and a principal at Paladin Capital Group, which invests in Nexidia. "You always want to know what's going on around you." Nexidia's technology, he says, can pick up on unusual sounds, avoiding the time-consuming and error-prone process of converting speech into text. Speech-to-text systems, for example, can't easily distinguish context--such as whether the word "bomb" means an explosive or a bad movie.
The increase in cell-phone traffic and Internet phone calls helps drive demand, Minihan says. Companies that run call centers, or any business with substantial stores of audio data, could benefit from speed and accuracy improvements.
Analyzing voice-over-IP calls for signs of customer stress is another application, says Mark Finlay, Nexidia's development manager. Analysts, he says, want to "infer behaviors that aren't just related to what's being said, but how it's being said."
Illustration by Brian Stauffer