News In Review

November 17, 1997

Voice Comes To The Desktop

New speech products let users just say the word (or words) to control their office applications

By Edward Cone

The voice-driven PC, long promised and long delayed, is now coming to the desktop so quickly that one of the early leaders is already plotting its exit strategy.

"We're gearing up for retail," says Charles Skamser, president of Applied Voice Recognition Inc. (AVRI). "Long term, somebody much bigger will take the retail market, so we want to change our company before it happens. We think commodity products for voice will be provided by companies like Microsoft, not this year, but soon."

For the moment, though, publicly traded AVRI is happy to ride the tsunami. The small Houston company (its 1996 revenue was just $275,000) sells VoiceCommander Pro, a product that lets users control a suite of office applications by speaking into a headset microphone. It also lets users schedule appointments, send E-mail, and dictate 200 words per minute into a word processor with more than 95% accuracy.

Earlier products were based on discrete speech technology and required users to speak robotically, pausing between words. But recent advances in processing speed and the algorithms used for speech recognition have enabled the introduction of breakthrough products from Dragon Systems and IBM. The AVRI VoiceCommander Pro is built on the IBM ViaVoice speech-recognition engine, to which AVRI adds the office-suite control functions and expanded command capability. "Continuous speech is the true beginning of speech for the desktop," says Jackie Fenn, VP and research director of advanced technologies at Gartner Group Inc.

Fenn forecasts that more users will adopt voice products as the products become integrated into operating environments. "More than 30% of general office workers will use some form of voice recognition by 2001," Fenn says.

Speech recognition is now poised to go global, reaching beyond the limits of traditional interfaces. IBM ships versions in several European languages and Mandarin Chinese, and will ship a Japanese-language product by year's end. The Japanese and Chinese markets are especially alluring because complicated writing schemes in those countries make keyboard use problematic. "China is a great opportunity," says William "Ozzie" Osborne, general manager of IBM Speech Systems. "We already have their top hardware vendors and Internet service providers using our technology."

The advent of continuous speech is unlikely to kill the discrete-speech market immediately. "Discrete speech products will keep shipping for a while because they can run on slower systems," says William Meisel, publisher of Speech Recognition Update, a newsletter published in Tarzana, Calif.

In fact, sales of discrete dictation products, once confined to specialized markets, have suddenly grown from the tens of thousands of units that were sold when the first consumer products were released by companies like Dragon, Kurzweil, and IBM, to hundreds of thousands today. "We've sold more units in the past four months than in the past seven years," says Roger Matus, VP of marketing for Dragon, in Newton, Mass.

But continuous speech is where growth lies. "I hear that Dragon NaturallySpeaking was on a home shopping channel," says newsletter publisher Meisel. "This market could hit a million units by year's end, and tens of millions next year. Consumer outlets really love this stuff, because it encourages upgrading of PCs."

Prices for the products are already competitive with other desktop applications. IBM's ViaVoice continuous speech product lists for $99 and can be found for as little as $80, while discrete speech products start in the $50-to-$60 range.

Early adopters of the speech products tend to be professionals such as attorneys and physicians who do a lot of dictation. "You just can't get an attorney to talk like a robot, so that alone makes continuous speech many times better than previous products," says AVRI customer Michael Saltsman, director of IS at the Houston law firm of Bayko, Gibson, Carnegie, Hagan, Schoonmaker, & Meyer. "Now we can start to reap the benefits of cost-cutting on things like reduced overtime for transcription."

Another AVRI user is Hugh Barrett, managing director of the Strategic Alliance Group, a Houston consulting firm. "Our overall productivity has gone up 25% to 30% with the use of discrete speech," he says. "Our switch to continuous speech could add 10% more."

As the speech market comes of age, the players are watching out for Microsoft. While the company has yet to offer a speech product, CEO Bill Gates has announced a commitment to voice. Microsoft also has invested $45 million in the Belgian company Lernout & Hauspie, which sells the Kurzweil voice-recognition products.

What will happen when Microsoft starts building voice-recognition into every operating system it ships? One opinion: very little. "Their operating system incorporates a paint program, but that hasn't hurt the market for design applications," notes Dragon VP Matus. "We're aimed at people who are going to use speech to make them more productive. I don't expect to see a speech-scripting language like the one we offer from Microsoft anytime soon."

IBM's Osborne predicts that add-on products will remain commercially viable in a world with speech-enabled operating systems, as long as these add-ons introduce new features and uses. "Think of the way graphics controllers got standardized on chips, but the next level of functionality was on another board," says Osborne. "The number of applications that do speech will be huge."

Microsoft is keeping its options open by supporting the development of a speech API. "There will always be a place for special needs, for products with incredible accuracy," says Kevin Schofield, senior program manager in the Microsoft Research speech technology group. "If we've got an open platform, others can plug in."

Competition among the big and small vendors is expected to grow. "Right now, AVRI has more functionality as an integrated desktop product than we offer," says IBM's Osborne. "We're headed there, although it is not our intent to squeeze them."

AVRI's Skamser says his company plans to grow by tapping three markets. First, of course, are applications that sit on top of mass-marketed technology from IBM and eventually Microsoft. "Next, we plan to move into several vertical specialities with specific needs, such as medical and legal professionals," he says.

The third and long-term market is custom voice-centered solutions for large companies. "A lot of big companies are going to leap over client-server straight to voice," says Skamser. With a celebrity endorsement from basketball star Hakeem Olajuwon of the Houston Rockets (his Nigerian-accented English is understood by VoiceCommander Pro) and a recently completed round of financing in place, AVRI has big plans for the future, and big challenges as well.

See related story, "Understanding Voice Is The Key."

