I had expected most of these kinds of problems to be resolved within a few years as a simple matter of ongoing development and hardware upgrades, but instead these basic design issues are still around and still making it hard for organizations to effectively deploy VoIP solutions.
Worse, newer problems keep emerging while older issues remain unresolved. In 1998, we wrote that "our attempts to use features such as 'hold' or 'transfer' across vendors' product lines forced calls to drop." The last time I checked, this still hadn't been adequately resolved. Meanwhile, this article by Tony Mancill shows that interoperability problems with even the most basic telephony features are just getting nastier:
The DTMF issues in VOIP deployment center around several factors. First, there are several ways of transmitting DTMF digits in an IP-telephony environment. The simplest involves merely sending the DTMF tones along with the voice media stream, and is known as "in-band." This is pretty simple to generate; however, detecting these tones on the far end requires either dedicated DSP hardware or lots of CPU time-slices in a general purpose system. (Keep in mind that, in the general purpose system case, you have to constantly scan the audio stream for these tones.)
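To give a sense of what "constantly scanning the audio stream" means in software, here is a minimal sketch of in-band detection using the Goertzel algorithm, a common way to measure energy at a handful of known frequencies. It is an illustration, not the method any particular switch uses. The function names, the 205-sample block size, the 8 kHz sample rate, and the 0.1 threshold are all assumptions chosen for the example; a production detector would also check twist, harmonics, and tone duration.

```python
import math

# Standard DTMF row (low) and column (high) frequencies in Hz.
DTMF_LOW = (697, 770, 852, 941)
DTMF_HIGH = (1209, 1336, 1477, 1633)
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Return the squared magnitude of `samples` at a single frequency."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_digit(samples, sample_rate=8000, threshold=0.1):
    """Return the DTMF key present in one block of audio, or None.

    `threshold` is a hypothetical tuning value: each tone must carry at
    least that fraction of len(samples) * total block energy.
    """
    energy = sum(x * x for x in samples)
    if energy == 0.0:
        return None
    floor = threshold * len(samples) * energy
    low_power = {f: goertzel_power(samples, f, sample_rate) for f in DTMF_LOW}
    high_power = {f: goertzel_power(samples, f, sample_rate) for f in DTMF_HIGH}
    low = max(low_power, key=low_power.get)
    high = max(high_power, key=high_power.get)
    if low_power[low] < floor or high_power[high] < floor:
        return None
    return KEYS[DTMF_LOW.index(low)][DTMF_HIGH.index(high)]


def make_tone(key, n=205, sample_rate=8000, amplitude=0.5):
    """Generate n samples of the two-tone signal for `key` (test helper)."""
    for row, keys in enumerate(KEYS):
        if key in keys:
            low, high = DTMF_LOW[row], DTMF_HIGH[keys.index(key)]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    step = 2.0 * math.pi / sample_rate
    return [amplitude * (math.sin(low * step * i) + math.sin(high * step * i))
            for i in range(n)]
```

Calling `detect_digit(make_tone("5"))` should identify the key, while a block of silence yields `None`. Note that the detector must run eight Goertzel filters over every block of every call, whether or not anyone is pressing a key, which is where the CPU cost comes from.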
Some clever folks got together and decided that there are much better ways of doing this. After all, DTMF has more semantic meaning as an event than as audio content. So several standards--including RFC2833 and SIP INFO--describe how the DTMF key-press can be sent as a single packet or a small set of packets.
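For comparison, the RFC 2833 approach carries each key-press as a four-byte payload inside an RTP packet: an event code, an end-of-event flag, a volume field, and a duration. The sketch below packs and unpacks just that payload. The helper names are hypothetical, and the RTP header, the negotiated payload type, and the retransmissions a real sender performs are all left out.

```python
import struct

# Event codes for DTMF keys as defined by RFC 2833: 0-9, then *, #, A-D.
EVENT_CODES = {key: code for code, key in enumerate("0123456789*#ABCD")}
EVENT_KEYS = {code: key for key, code in EVENT_CODES.items()}


def pack_event(key, duration, end=False, volume=10):
    """Build the 4-byte telephone-event payload for one key-press.

    duration is in RTP timestamp units (samples at 8000 Hz for typical
    narrowband audio); volume is the tone level in -dBm0, 0..63.
    """
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    flags = (0x80 if end else 0x00) | volume  # E bit, R bit = 0, volume
    return struct.pack("!BBH", EVENT_CODES[key], flags, duration)


def unpack_event(payload):
    """Return (key, end, volume, duration) from a 4-byte payload."""
    code, flags, duration = struct.unpack("!BBH", payload)
    return EVENT_KEYS[code], bool(flags & 0x80), flags & 0x3F, duration
```

For example, `pack_event("5", 800, end=True)` describes a 100 ms press of the 5 key in four bytes, whereas 100 ms of uncompressed G.711 audio takes 800 bytes.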
A key-press can be sent with far less bandwidth than the corresponding audio media would require. But as with many standards, it's the choices that'll drive you nuts. In some cases, a VOIP switch may be required to detect in-band DTMF, but may generate RFC2833 for an upstream provider, who may terminate the call on the PSTN or route it to a legacy voice mail system, and need to regenerate the audible DTMF tones for that voice mail system, for an E-911 system, etc.
Even if the needed tones are regenerated, however, there can be a serious problem in such a (relatively common) configuration: If the analog telephone adapter (ATA) connected to the VOIP switch generates the in-band tones, the switch has to detect and squelch them from the audio stream because the upstream provider has specified RFC2833 DTMF. However, the switch can't always completely eradicate the in-band DTMF even with very fast-reacting digital signal processors (DSPs). The problem is, by the time you've detected the DTMF, it's too late to recall the very first bit of it, and a short bit of DTMF might have been sent to the far end--maybe 5 ms worth. The RFC2833 event also is sent to the provider, which then generates the DTMF tone for the PSTN.
In the worst case, and believe me, it's not difficult to reproduce, the legacy voice mail system will "hear" the 5-ms blip of DTMF and the RFC2833 tone. Sometimes this makes the voice mail system think that the caller's password is 112233445566. It happens. Without naming any names--it doesn't really matter--the issue is far from solved, even with some of the biggest players in the business.
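The double-digit failure described above can be modeled with a toy simulation. This is only a sketch under loud assumptions: the switch forwards audio frame by frame with no look-ahead buffer, the far-end system treats the leaked blip and the regenerated tone as two separate presses, and every timing value and function name is hypothetical rather than taken from any real product.

```python
def leaked_ms(tone_ms, detect_ms, frame_ms):
    """Milliseconds of in-band tone forwarded before squelching begins.

    Assumes the switch forwards each frame as it arrives and can only
    start squelching once it has seen `detect_ms` of tone.
    """
    leaked = 0
    elapsed = 0
    while elapsed < tone_ms:
        chunk = min(frame_ms, tone_ms - elapsed)
        if elapsed < detect_ms:  # detector has not fired yet: frame escapes
            leaked += chunk
        elapsed += chunk
    return leaked


def far_end_digits(dialed, leak_ms, far_end_min_ms):
    """Digits a far-end system registers for the keys in `dialed`.

    Each key arrives once as the tone regenerated from the RFC 2833
    event, and once more if the leaked in-band blip is long enough for
    the far end to register (assumed threshold: `far_end_min_ms`).
    """
    heard = []
    for key in dialed:
        if leak_ms >= far_end_min_ms:
            heard.append(key)  # leaked in-band blip
        heard.append(key)      # tone regenerated by the provider
    return "".join(heard)
```

With a 5 ms detection delay and 5 ms frames, `leaked_ms(100, 5, 5)` gives 5, and `far_end_digits("123456", 5, 5)` gives `"112233445566"`. If nothing leaks, the far end hears `"123456"` as intended.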
DTMF is about as basic and elemental as it gets, yet the industry is struggling mightily with it.
I don't want to come across as some kind of Chicken Little on this topic, but the point remains that most of the issues that made VoIP hard to deploy and justify back in 1998 are still around today--as well as some new ones--which is unfortunate. And surprising.