Troubleshooting and analyzing VoIP networks


Cause of Jitter

VoIP packets should reach the recipient on time and, ideally, at regular intervals. This spacing (the interarrival time) is determined by the voice codec. On an IP network, however, transit times fluctuate: different packets can need different amounts of time to cross the network. This phenomenon is referred to as jitter, a typical VoIP problem that can severely affect the quality of a telephone call (choppy speech, poor intelligibility). Jitter is the difference between the expected and the actual arrival time of a packet. Ideally, this difference is 0ms. In practice, all common IP networks exhibit some jitter caused by the transmission components.

VoIP devices use a jitter buffer to compensate for these fluctuations by buffering a certain number of packets. Each received packet is held briefly before being forwarded to the receiving application, and any packet that arrives too late is discarded. This buffering itself adds delay. A jitter buffer is either fixed in size or dynamic; a dynamic (adaptive) jitter buffer can optimize its size to adapt to delay and packet loss.

Both fixed and adaptive jitter buffers are capable of automatically adapting to delay changes. For example, if the delay gradually changes by 20ms, some packet loss results in the short term. Over this period, the jitter buffer self-adjusts and thus avoids further data loss. The jitter buffer can thus be regarded as a time window: On one side of the window (the early side), the current data is captured, and the other side of the window (the late side) represents the maximum permissible delay (after which a packet is discarded). However, the jitter buffer can only compensate for delays within certain limits. If the jitter exceeds these limits, the speech signal is interrupted.
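The time-window view can be illustrated with a minimal sketch of a fixed jitter buffer. Everything here is an assumption made for illustration: the function name playout, the packet tuples, and the choice to take the first packet's network delay as the baseline. A real endpoint works on RTP timestamps and is considerably more elaborate.

```python
def playout(packets, buffer_ms=40):
    """Decide which packets a fixed jitter buffer plays and which it discards.

    packets: list of (seq, send_ms, arrival_ms) tuples in arrival order.
    The delay of the first packet is taken as the baseline (an assumption);
    each packet must arrive no later than send_ms + baseline + buffer_ms.
    """
    base_delay = packets[0][2] - packets[0][1]
    played, discarded = [], []
    for seq, send_ms, arrival_ms in packets:
        deadline = send_ms + base_delay + buffer_ms  # late side of the window
        if arrival_ms <= deadline:
            played.append(seq)
        else:
            discarded.append(seq)  # arrived after its playout slot
    return played, discarded


# Packets sent every 20ms; packet 3 is delayed by 100ms instead of about 50ms.
trace = [(1, 0, 50), (2, 20, 75), (3, 40, 140), (4, 60, 112)]
print(playout(trace, buffer_ms=40))
```

With a 40ms window, packet 3 misses its deadline and is dropped; enlarging buffer_ms to 60 would save it at the cost of more end-to-end delay.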

The VoIP endpoints themselves hardly contribute to jitter. The most common cause is contention between VoIP systems competing for limited transmission resources. For this reason, it is important to monitor timing with an analyzer and isolate the cause of the jitter. A VoIP analyzer performs a separate jitter calculation for each RTP stream.
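One widely used per-stream calculation is the interarrival jitter estimator defined in RFC 3550 (the RTP specification), which smooths the difference in transit time between consecutive packets with a gain of 1/16. Whether a particular analyzer uses exactly this formula is an assumption; the sketch below also assumes an 8000Hz RTP clock (as used by G.711) and ignores timestamp wraparound. The function name and input format are illustrative.

```python
def interarrival_jitter(packets, clock_rate=8000):
    """Running jitter estimate as described in RFC 3550, section 6.4.1.

    packets: iterable of (arrival_seconds, rtp_timestamp) in arrival order.
    Returns the estimate after the last packet, in milliseconds.
    """
    jitter = 0.0  # kept in RTP timestamp units
    prev = None
    for arrival, rtp_ts in packets:
        arrival_units = arrival * clock_rate  # convert wall clock to RTP units
        if prev is not None:
            prev_arrival_units, prev_ts = prev
            # D = difference in relative transit time of the two packets
            d = (arrival_units - prev_arrival_units) - (rtp_ts - prev_ts)
            jitter += (abs(d) - jitter) / 16.0
        prev = (arrival_units, rtp_ts)
    return jitter / clock_rate * 1000.0


# Two packets sent 20ms apart (160 timestamp units), received 25ms apart.
print(interarrival_jitter([(0.000, 0), (0.025, 160)]))
```

Because of the 1/16 smoothing, a single late packet moves the estimate only slightly; sustained variation is what drives the value up.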

Deviating Paths Cause Sequence Errors

Data and voice packets are transmitted independently of each other across the network from the sender to the receiver. Each packet is therefore subject to its own delay, even if all packets follow exactly the same transmission path. However, the usual cause of sequence errors is the routing of packets from sender to receiver over different IP networks and subnets, which results in different delay times. Because of these path-related delays, a small number of packets arrive late at the VoIP endpoint, which directly degrades the received signal and thus speech quality. As a rule, the packets are buffered, which allows the endpoint to put the received packets back in the correct order and thus restore the original data stream.

Sequence errors are not a problem in classical data communication. The receiver arranges the data packets in the correct order by TCP sequence number and passes a correct data stream to the higher-level application. Because of the real-time conditions of VoIP systems, sequence errors in the transmission of voice over IP networks must be countered with a completely different strategy. Some VoIP systems discard all packets received out of sequence; others discard out-of-sequence packets only if the sequence gap exceeds the length of the internal buffer. This discarding of packets results in some jitter and, of course, packet loss.
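The second strategy, reordering within the limits of an internal buffer, can be sketched as follows. This is an illustrative model only, not the behavior of any specific device: the function name reorder and the depth parameter are assumptions, and the wraparound of the 16-bit RTP sequence number is ignored for brevity.

```python
import heapq


def reorder(arrivals, depth=3):
    """Restore packet order with a bounded buffer.

    arrivals: RTP sequence numbers in the order they were received.
    depth: number of packets the buffer may hold while waiting for a gap
           to be filled (hypothetical parameter).
    Returns (delivered, dropped) lists of sequence numbers.
    """
    heap, delivered, dropped = [], [], []
    next_seq = None
    for seq in arrivals:
        if next_seq is None:
            next_seq = seq  # first packet defines the starting point
        if seq < next_seq:
            dropped.append(seq)  # its slot has already been played out
            continue
        heapq.heappush(heap, seq)
        # Release packets while the expected one is at the head, or while
        # the buffer is over capacity and the gap has to be given up.
        while heap and (heap[0] == next_seq or len(heap) > depth):
            released = heapq.heappop(heap)
            delivered.append(released)
            next_seq = released + 1
    while heap:  # flush whatever is left at the end of the stream
        delivered.append(heapq.heappop(heap))
    return delivered, dropped


print(reorder([1, 3, 2, 4, 7, 5, 6, 8], depth=2))  # small swaps are repaired
print(reorder([1, 5, 6, 7, 2, 3], depth=2))        # 2 and 3 arrive too late
```

A larger depth repairs more reordering but, like a larger jitter buffer, adds delay before playback.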

Quality and Combinations of Codecs

Before a VoIP call can be made, the analog audio signal must be converted into a digital signal. Often, only limited bandwidth is available, so some compression of the signal is also necessary. The aim is to achieve the highest possible voice quality at the lowest possible data rate. An encoding process with a low bit rate and high compression requires considerable computing power for encoding and decoding. If this is not available, a method with a higher bit rate inevitably has to be used.

The transmitter samples and quantizes the speech signal and then encodes it. The encoded signal is transmitted over the network and decoded by the receiver (i.e., converted back into an analog signal for playback). The most important requirement for the coding method is that the signal can be encoded and decoded in real time, that is, with minimal delay.

"Codec" is a portmanteau of "coder" and "decoder" and describes a process for converting analog voice or video information to a digital, often compressed, format. The methods for encoding an analog signal are numerous and have advanced considerably in recent years. Greater computing capacity and higher bandwidths, especially, have improved the options for codec developers, who nevertheless still struggle with many problems. Codecs need to conserve resources on the one hand and reproduce the original signal as faithfully as possible on the other. Incorrectly transmitted or missing packets need to be concealed without noticeably affecting signal quality.

IT managers always need to consider the current network conditions when selecting codecs for the terminal devices (Table 1). For example, the use of a G.711 codec (PCM) on a narrowband connection leads to considerable delays. Packets that arrive too late as a result are discarded, which degrades voice quality. If only low bandwidth (<64Kbps net and <85Kbps gross) is available in a transmission channel, a codec with higher compression and thus a lower bandwidth requirement simply has to be used; the G.729 or G.723 codecs are suitable.
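The gap between net and gross bandwidth comes from the headers added to every voice packet. The following sketch estimates the per-call rate under stated assumptions: a 20ms packetization interval and 40 bytes of IPv4, UDP, and RTP headers per packet (20 + 8 + 12). Link-layer framing is not included by default, so the gross figure on a real line depends on the access technology. The function name and parameters are illustrative.

```python
def gross_kbps(codec_kbps, packet_ms=20, overhead_bytes=40):
    """Estimate the one-way bandwidth of a single call in Kbps.

    codec_kbps: net codec bit rate, e.g. 64 for G.711 or 8 for G.729.
    packet_ms: packetization interval (assumed 20ms).
    overhead_bytes: header bytes per packet; 40 covers IPv4 + UDP + RTP.
    """
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second / 1000


print(gross_kbps(64))                      # G.711 at the IP layer
print(gross_kbps(8))                       # G.729 at the IP layer
print(gross_kbps(64, overhead_bytes=58))   # G.711 plus 18 bytes of Ethernet framing
```

Under these assumptions, G.711 needs 80Kbps at the IP layer and G.729 needs 24Kbps, which shows that header overhead dominates for low-bit-rate codecs.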

Table 1

ITU-T Voice-Encoding Specs

Standard         Algorithm*       Bit Rate (Kbps)
G.711            PCM              48, 56, 64
G.723.1          MP-MLQ/ACELP     5.3, 6.3
G.728            LD-CELP          16
G.729            CS-ACELP         8
G.729 annex A    CS-ACELP         8
G.722            Subband ADPCM    48, 56, 64
G.726            ADPCM            16, 24, 32, 40
G.727            AEDPCM           16, 24, 32, 40

*PCM, pulse code modulation; MP-MLQ, multipulse maximum likelihood quantization; CELP, code-excited linear prediction; ACELP, algebraic CELP; LD-CELP, low-delay CELP; CS-ACELP, conjugate-structure ACELP; ADPCM, adaptive differential PCM; AEDPCM, embedded ADPCM.

In practice, today's corporate LANs and IP networks offer sufficient bandwidth, so the highest quality codecs can generally be used. For external calls, bandwidth bottlenecks can occur at the transition to the public network. As a rule, the upstream bandwidth of a digital subscriber line (DSL) is lower than the downstream bandwidth. If a bottleneck occurs in the upstream because of too many parallel VoIP/IP connections, the default codec G.711 must be replaced with a codec that needs less bandwidth. The G.729a (CS-ACELP), G.723.1 (MP-MLQ), and G.726 (ADPCM) codecs can be used in this case. Although they reduce voice quality, they require less bandwidth for transport.

Terminal devices negotiate the codec when establishing a VoIP connection. The system administrator can configure which codecs the terminal devices offer. When a call is established, each terminal device sends the codecs it supports to the other end in the form of a codec list. If multiple codecs are supported, the preferred codec should always be at the top of the list. The terminal devices check the lists for the best match and then use that codec for the connection. If no common codec can be found, the call will not be established.
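The matching step can be sketched as picking the first entry in the caller's preference list that the called party also supports. This is a simplified model of the negotiation, and the function name and codec labels are illustrative; real signaling protocols carry additional parameters per codec.

```python
def negotiate(offered, supported):
    """Return the first codec in the offered list that the other side supports.

    offered: caller's codecs, most preferred first.
    supported: codecs the called party can handle (order ignored here).
    Returns None if there is no common codec, in which case the call fails.
    """
    supported_set = set(supported)
    for codec in offered:
        if codec in supported_set:
            return codec
    return None


print(negotiate(["G.711", "G.729", "G.723.1"], ["G.729", "G.711"]))
print(negotiate(["G.722"], ["G.729"]))
```

In the first call both sides share G.711 and G.729, and the caller's preference decides in favor of G.711; in the second there is no overlap, so no call can be set up.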

Codecs have different identifiers (RTP payload types), which are carried in the header of each RTP packet so that the station receiving the RTP stream knows which decoder it needs to use for the received signal. The identifiers are exchanged during signaling so that the recipients of the data can select the appropriate decoder. If the wrong codec is used, the call will not play back correctly, and all voice information will be lost.
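To show where this identifier sits, the sketch below reads the 12-byte fixed RTP header defined in RFC 3550 and looks up the payload type in a small table of static assignments from RFC 3551. The table covers only a few codecs from Table 1; dynamically assigned payload types (96 to 127) can only be resolved from the signaling and are reported as unknown here. The function name and the returned dictionary are illustrative.

```python
import struct

# A few static payload types from RFC 3551 (not a complete list).
STATIC_PAYLOAD_TYPES = {
    0: "PCMU (G.711 mu-law)",
    4: "G723",
    8: "PCMA (G.711 A-law)",
    9: "G722",
    15: "G728",
    18: "G729",
}


def parse_rtp_header(data):
    """Extract the main fields of the fixed RTP header from raw bytes."""
    if len(data) < 12:
        raise ValueError("RTP header needs at least 12 bytes")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    if byte0 >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    payload_type = byte1 & 0x7F  # lower 7 bits; the top bit is the marker
    return {
        "payload_type": payload_type,
        "codec": STATIC_PAYLOAD_TYPES.get(payload_type, "dynamic or unknown"),
        "marker": bool(byte1 & 0x80),
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hand-built header: version 2, payload type 18, sequence number 1234.
header = struct.pack("!BBHII", 0x80, 18, 1234, 160, 0xDEADBEEF)
print(parse_rtp_header(header))
```

Comparing the payload type seen in the stream with the codec agreed during signaling is a quick way to spot a mismatch between the two.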

If two terminals communicating with each other do not use the same codec and the voice information is nevertheless reproduced correctly, then at least one media gateway must exist somewhere between the two terminals. A media gateway translates between the differently coded signals. This kind of codec conversion usually degrades signal quality. For this reason, as few codec conversions as possible should take place on a VoIP network.

Using the right codec improves voice quality. The analysis of the connection information with the help of a VoIP analyzer reveals problems during the negotiation of codecs, and voice quality can be improved considerably by reconfiguring the active codecs on the endpoints or gateways.
