Troubleshooting and analyzing VoIP networks

Speechless

MOS Value and R-Factor

Speech quality describes the intelligibility of a human voice when it is recorded and played back by a technical device. Any assessment of speech quality is subjective and depends on the equipment used, the recording environment, the transmission path, and the environment in which the speech is reproduced. The ITU specifies how to evaluate speech quality in Recommendation P.800, "Methods for subjective determination of transmission quality."

The best-known method for assessing speech quality is the mean opinion score (MOS), in which a group of test listeners rates the perceived quality on a fixed scale and the ratings are averaged. However, you will not want to rely on MOS as the sole criterion for evaluating VoIP connections.

MOS values range from 1 (poor speech quality with no communication possible) to 5 (excellent transmission quality that cannot be distinguished from the original). Table 2 shows the most common codecs and the MOS value determined for each, which corresponds to the best quality a voice codec can obtain.

Table 2

MOS Values of Codecs

Codec     MOS
G.711     4.4
G.729     3.92
G.726     3.85
iLBC*     3.8
G.729a    3.7
G.723.1   3.65
G.728     3.61
*iLBC, Internet Low Bitrate Codec.

The E-model (ITU Rec. G.107) describes a calculation for planning and evaluating the transmission quality of communication networks. This calculation model is used to determine the voice quality available to the user in a connection. The result is an objective evaluation of the transmission quality, taking into account all influencing factors. The three most important parameters of the E-model are:

  • Equipment impairment factor (Ie): No unit, default value 0, valid range from 0 to 40.
  • Packet loss robustness factor (Bpl): No unit, default value 1, valid range from 1 to 40.
  • Random packet loss probability (Ppl): Percentage, default value 0, valid range from 0 to 20.
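To see how these three parameters interact, the following minimal Python sketch applies the effective equipment impairment formula from G.107, Ie-eff = Ie + (95 - Ie) * Ppl / (Ppl / BurstR + Bpl). The function name and the sample codec values are assumptions made for illustration; the Ie and Bpl figures are commonly quoted values of the kind listed in ITU-T G.113 Appendix I and should be checked against the current recommendation before you rely on them.

```python
def effective_equipment_impairment(ie: float, bpl: float, ppl: float,
                                   burst_r: float = 1.0) -> float:
    """Return Ie-eff, the equipment impairment adjusted for packet loss.

    ie      -- equipment impairment factor of the codec (0 to 40)
    bpl     -- packet loss robustness factor of the codec (1 to 40)
    ppl     -- packet loss probability in percent (0 to 20)
    burst_r -- burst ratio; 1.0 means losses are random (not bursty)
    """
    if not 0.0 <= ppl <= 100.0:
        raise ValueError("ppl must be a percentage between 0 and 100")
    if bpl <= 0.0 or burst_r <= 0.0:
        raise ValueError("bpl and burst_r must be positive")
    return ie + (95.0 - ie) * ppl / (ppl / burst_r + bpl)


# Assumed example values (verify against G.113 Appendix I):
# a G.729A-style codec with Ie = 11 and Bpl = 19 at 2 percent random loss.
if __name__ == "__main__":
    print(effective_equipment_impairment(ie=11.0, bpl=19.0, ppl=2.0))
```

With no packet loss, the result is simply the codec's own Ie; as Ppl grows, a codec with a low Bpl is penalized much faster than one with a high Bpl.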

The E-model is a passive model calculated by a measuring system to determine the speech quality of a VoIP stream. After the parameters have been passed to the E-model, the measuring system outputs a transmission rating factor (R-factor). This value predicts speech quality on a scale from 0 to 100 and can be mapped to the MOS scale (Table 3).

Table 3

R-Factors and MOS Values

R-Factor  Quality                                                                                  MOS
100       Excellent: No effort is needed to understand speech.                                     5
80        Good: Through attentive listening, speech can be perceived without effort.               4.0
60        Fair: Speech can be grasped with a slight effort.                                        3.1
50        Moderate: Requires great concentration and effort to understand the transmitted speech.  2.6
0-49      Inadequate: Despite great effort, no communication is possible.                          <=1
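The mapping between the two scales can also be computed. The sketch below implements the R-to-MOS conversion given in ITU-T G.107; the function name is an assumption. Note that this formula caps the estimated MOS at 4.5, so it yields 4.5 rather than 5 for an R-factor of 100, while the values for R-factors of 80, 60, and 50 agree with Table 3 after rounding.

```python
def r_to_mos(r: float) -> float:
    """Convert an E-model R-factor to an estimated MOS (G.107 conversion)."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    mos = 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
    # The polynomial dips slightly below 1 for very small R; clamp it
    # so the result stays on the MOS scale.
    return max(1.0, mos)


if __name__ == "__main__":
    for r in (100.0, 93.2, 80.0, 60.0, 50.0):
        print(f"R = {r:5.1f}  ->  MOS ~ {r_to_mos(r):.2f}")
```

For the G.711 reference R-factor of 93.2, the conversion returns roughly 4.4, which matches the G.711 entry in Table 2.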

The E-model has established itself as a quasi-standard for the objective assessment of speech quality (in contrast to the subjective MOS measurement method). Because the R-factor can be derived directly from the measured values generated in tests, this value reflects the real traffic parameters. Nevertheless, a correlation with MOS values is possible. The best theoretical R-factor that can be achieved is 100, but this value does not take into account the codecs used. If, for example, a typical G.711 codec is used in a reference environment, a maximum R-factor of approximately 93.2 can be achieved. The following causes contribute to the deterioration of an R-factor:

  • Codec type: Codecs with higher compression rates usually have poorer R-factors.
  • Available bandwidth: Limitations of the transmission bandwidth are determined by the entire transmission system along the transmission path.
  • Delays and jitter: These occur on the network and at the terminal devices and are particularly high in mobile wireless telephones because of the lack of bandwidth on the wireless network.
  • Packet losses: These are attributable to physical network failures, overloaded networks, and coupling components.
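How delay and loss pull the R-factor down from the G.711 reference value can be sketched with a reduced form of the E-model, R = 93.2 - Id - Ie-eff. The delay impairment Id used here is a widely cited simplification (often attributed to Cole and Rosenbluth), not the full G.107 delay calculation, and the function names are assumptions. Treat the result as a rough estimate only.

```python
def delay_impairment(one_way_delay_ms: float) -> float:
    """Approximate delay impairment Id (simplified, not the full G.107 model)."""
    d = one_way_delay_ms
    impairment = 0.024 * d
    if d > 177.3:
        # Beyond roughly 177 ms, conversational quality degrades faster.
        impairment += 0.11 * (d - 177.3)
    return impairment


def r_factor(one_way_delay_ms: float, ie_eff: float, r0: float = 93.2) -> float:
    """Estimate R from one-way delay and the effective equipment impairment.

    r0 defaults to 93.2, the reference value for G.711 mentioned above.
    ie_eff is the codec and packet loss impairment (Ie-eff).
    """
    r = r0 - delay_impairment(one_way_delay_ms) - ie_eff
    return max(0.0, min(100.0, r))


if __name__ == "__main__":
    # Hypothetical call: 150 ms one-way delay, Ie-eff of 3.6
    print(round(r_factor(150.0, 3.6), 1))
```

Jitter does not appear as a separate term in this sketch: It shows up indirectly, as additional delay introduced by the jitter buffer and as packets discarded for arriving too late.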

An analysis tool must therefore be able to evaluate currently recorded conversations in line with both measurement methods (Figure 2). Because the E-model is related to the packet parameters of the respective RTP session, the measurement reflects the quality of the local network segment.

Figure 2: A detailed investigation of a selected RTP call in the analyzer provides information on the causes of the error.

Recording VoIP Connections

When monitoring the network and subsequently evaluating the signaling and voice packets, it is necessary to record the data selectively. In the analyzer, you need to pay attention to the following parameters for all recorded VoIP connections:

  • Type (SIP, H.323, MGCP, or Skinny)
  • Source and destination
  • Start time and duration
  • Status of the connection (connection setup, connection active, connection terminated)
  • Initiator and counterpart

For any connection you select, the signaling of the connection can be displayed as a directional chart, and the RTP sessions belonging to the connection in question can also be listed. A packet view provides you with a detailed representation (including timestamps), the source and destination addresses, and the packet type of the recorded VoIP packets.

You can use filters to define the individual criteria that a packet must meet to be displayed. It is particularly important to ensure that the VoIP analyzer supports a wide range of different filter options:

  • Connection filter: Displays the information associated with a connection (signaling and RTP data).
  • String filter: Selects only packets that contain the entered string.
  • IP/MAC filter: Displays only packets with the IP/MAC address specified by the VoIP administrator.
  • Protocol filter: Selects the packet types to display in the protocol overview after filtering: All TCP/IP packets, TCP packets only, UDP packets only, or VoIP packets only.

Combining the filters enables even more precise selection of the desired packets.
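The idea of combining filters can be sketched in a few lines of Python. The Packet class and the filter helpers below are hypothetical and do not correspond to the interface of any particular analyzer; they only illustrate that each filter is a yes/no test and that a packet is displayed when it passes all of them.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Packet:
    """Hypothetical, simplified view of a captured packet."""
    src: str
    dst: str
    protocol: str   # for example "SIP", "RTP", or "TCP"
    payload: bytes


PacketFilter = Callable[[Packet], bool]


def ip_filter(address: str) -> PacketFilter:
    """Match packets sent from or to the given IP address."""
    return lambda p: address in (p.src, p.dst)


def protocol_filter(*protocols: str) -> PacketFilter:
    """Match packets whose protocol is one of the given names."""
    wanted = {name.upper() for name in protocols}
    return lambda p: p.protocol.upper() in wanted


def string_filter(text: str) -> PacketFilter:
    """Match packets whose payload contains the given string."""
    needle = text.encode()
    return lambda p: needle in p.payload


def apply_filters(packets: Iterable[Packet], *filters: PacketFilter) -> List[Packet]:
    """Keep only the packets that satisfy every filter."""
    return [p for p in packets if all(f(p) for f in filters)]


if __name__ == "__main__":
    capture = [
        Packet("10.0.0.5", "10.0.0.9", "SIP", b"INVITE sip:5551234@example.com"),
        Packet("10.0.0.5", "10.0.0.9", "RTP", b"\x80\x00"),
        Packet("10.0.0.7", "10.0.0.9", "SIP", b"INVITE sip:5559999@example.com"),
    ]
    # SIP packets involving 10.0.0.5 that mention the number 5551234
    hits = apply_filters(capture, protocol_filter("SIP"),
                         ip_filter("10.0.0.5"), string_filter("5551234"))
    print(len(hits))
```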

Flexible filter settings mean that only packets or packet contents that meet the specific requirements of the user are displayed. For example, you can extract from a data stream only those packets that contain a particular telephone number, or display only the data for failed connections that involve that number.

Speech Recording

The MOS value and the R-factor can be used to determine the technical quality criteria of a VoIP stream. However, to judge the transmitted speech according to subjective user criteria, it might be necessary to reproduce the telephone call in question. For this purpose, a VoIP analyzer is capable of recording and playing back the transmitted speech. Letting users simply listen to the transmitted speech signals opens up the possibility of assessing genuine speech quality.

Additionally, you can decode the recorded RTP session separately for each direction; the resulting error pattern lets you evaluate quality over the entire duration of the session.

The ability to display the RTP control protocol (RTCP) information for an RTP session is also useful for analyzing transmitted data streams. The display includes the packet numbers, timestamps, absolute time, interarrival time, RTCP type (i.e., sender report, SR; receiver report, RR), RTCP timestamps, packet and octet count, fraction lost, cumulative packet loss, extended highest sequence number received, interarrival (IA) jitter, last SR timestamp, and delay since last SR.
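The interarrival jitter reported in RTCP receiver reports is a running estimate defined in RFC 3550. The following sketch reproduces that calculation from a list of arrival times and RTP timestamps; the function name and the input format are assumptions, the clock rate of 8000Hz applies to narrowband codecs such as G.711, and timestamp wraparound is ignored for brevity.

```python
from typing import Iterable, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, int]],
                        clock_rate: int = 8000) -> float:
    """Estimate RTP interarrival jitter as described in RFC 3550.

    packets    -- (arrival_time_in_seconds, rtp_timestamp) pairs in arrival order
    clock_rate -- RTP timestamp clock in Hz
    Returns the jitter in RTP timestamp units.
    """
    jitter = 0.0
    previous = None
    for arrival_s, rtp_ts in packets:
        arrival_units = arrival_s * clock_rate  # arrival time in timestamp units
        if previous is not None:
            prev_arrival, prev_ts = previous
            # Difference between the spacing at the receiver and at the sender
            d = (arrival_units - prev_arrival) - (rtp_ts - prev_ts)
            # Smoothed estimate with a gain of 1/16
            jitter += (abs(d) - jitter) / 16.0
        previous = (arrival_units, rtp_ts)
    return jitter


if __name__ == "__main__":
    # 20 ms packets (160 timestamp units each); the third one arrives 10 ms late
    stream = [(0.000, 0), (0.020, 160), (0.050, 320), (0.060, 480)]
    j = interarrival_jitter(stream)
    print(f"jitter: {j:.2f} units = {j / 8000 * 1000:.3f} ms")
```

Because of the 1/16 gain, a single late packet raises the estimate only slightly, whereas sustained variation in packet spacing pushes it up steadily.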
