Transparent SIP communication with NAT

Number, Please

NAT Traversal and VoIP

One tried and tested means of working around NAT components is manual device configuration, in which the NAT device is configured to forward certain data packets to a specific local computer. NAT usually determines forwarding on the basis of the destination port in the data packet and therefore requires a port number (or port range) and the IP address of the local computer for port forwarding. With fixed forwarding by port number, the local computer can be reached from outside the network on a fixed port (or port range). The big advantage of port forwarding is that it is the only NAT traversal technique that actually works for many applications, although this advantage is offset by a number of important disadvantages:

  • Other local computers cannot use this port because of the fixed assignment of a port number to a specific computer.
  • Many applications select the port dynamically, making it difficult to determine beforehand or to select a port from a port range.
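
As a rough illustration of the fixed assignment described above, the following Python sketch models a static forwarding table. The addresses, ports, and the forward() helper are hypothetical and not taken from any particular NAT device; they only show why a forwarded port is tied to exactly one local computer.

```python
from typing import Dict, Optional, Tuple

# Hypothetical static forwarding rules: public port -> (local IP, local port).
FORWARDS: Dict[int, Tuple[str, int]] = {
    5060: ("192.168.1.20", 5060),  # SIP signaling for a single phone
}

# Hypothetical forwarded port range for media, all bound to the same phone.
RTP_RANGE = range(10000, 10011)
RTP_TARGET = "192.168.1.20"


def forward(dst_port: int) -> Optional[Tuple[str, int]]:
    """Return the local endpoint for an incoming packet, or None to drop it."""
    if dst_port in FORWARDS:
        return FORWARDS[dst_port]
    if dst_port in RTP_RANGE:
        return (RTP_TARGET, dst_port)
    return None  # no rule: the NAT has no idea which local computer is meant


print(forward(5060))
print(forward(8080))
```

Because each public port maps to one local endpoint, a second phone on the same network would need a different public port, and an application that picks its port dynamically might land outside the forwarded range.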

The STUN mechanism for transparently routing VoIP streams across NAT systems enables a VoIP endpoint to determine the correct public IP address, provides a mechanism for checking connections between two endpoints, and provides additional mechanisms for maintaining NAT address mappings using a keepalive protocol (Figure 3).

Figure 3: A sample sequence for the STUN mechanism.

An earlier version of STUN, described in RFC 3489 [3] and now referred to as "classic STUN," had to be completely revised on the basis of experience gained in practice. The new STUN (according to RFC 5389 [4]) is now just a mechanism used in conjunction with other specifications (e.g., SIP-OUTBOUND, TURN, and ICE).

The task of a standalone STUN server is to provide the correct transport addresses using the STUN binding function. A STUN server must be able to send and receive messages over both UDP and TCP. A plain vanilla STUN server provides only a partial solution to the problem of correct transfer over NAT gateways. For this reason, a STUN server always collaborates with other components; STUN is more like a tool within a more comprehensive NAT traversal solution. The following STUN uses are currently defined:

  • Interactive connectivity establishment (ICE)
  • Client-oriented SIP connections to external resources (SIP-OUTBOUND)
  • NAT behavior discovery (BEHAVE-NAT)

For VoIP endpoints, STUN provides a mechanism for correctly determining the IP address and the port currently used at the other end of a NAT gateway or router (transition between the private and a public IP address range). In contrast to classic STUN, the information can be transmitted over TCP as well as UDP. The new STUN can also be used to negotiate optional attributes and authentication with VoIP servers.
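
To make the binding function more concrete, the following Python sketch builds a STUN Binding Request and decodes the XOR-MAPPED-ADDRESS attribute of a success response, following the message layout of RFC 5389 as far as I understand it. The function names are hypothetical, only IPv4 is handled, and no network traffic is sent; treat it as an illustration rather than a complete client.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value in every RFC 5389 message header
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return a 20-byte Binding Request without attributes."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(message):
    """Return (ip, port) as seen by the server, or None if not found."""
    if len(message) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        # value[1] is the address family; 0x01 means IPv4.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        # Attributes are padded to a multiple of four bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    return None
```

A client would send the request to a STUN server over UDP or TCP and pass the reply to parse_xor_mapped_address(); the returned address and port are what the endpoint looks like from the public side of the NAT.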

TURN as a Last Resort

STUN enables a client to determine the correct transport address on which the terminal device can be reached from the public network. If direct communication between the two SIP terminals is not possible and STUN does not provide functional address mapping, the services of a relay computer are used. This mechanism was published in RFC 5766 – "Traversal Using Relays Around NAT (TURN)" [5].

The goal of TURN is to provide the client with a publicly accessible address/port tuple even in these situations. The only way to achieve this in all cases is to route the data through a TURN server that can be reached on the public network. For this purpose, a client can request an endpoint on the TURN server, on which it will then be publicly accessible. The server then forwards the packets arriving at this endpoint to the client.

Because TURN behaves like port-restricted NAT here, the process does not undermine the security functions of NAT and firewalls. A client that has set up an endpoint on a server via TURN must first send a packet to each peer from which it wants to receive packets. Operating servers on well-known ports behind NAT is therefore not possible. The protocol is based on STUN and shares its message structure and basic mechanisms.
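
The port-restricted behavior can be sketched in a few lines of Python. The RelayAllocation class and its method names are hypothetical and deliberately simplified: permissions are keyed by address and port to mirror the description above, whereas a real TURN server follows the permission rules and lifetimes defined in the TURN specification.

```python
class RelayAllocation:
    """Toy model of one client's endpoint on a relay server."""

    def __init__(self, relay_addr):
        self.relay_addr = relay_addr   # public (ip, port) handed to the client
        self.permitted = set()         # peers the client has already sent to

    def client_sends_to(self, peer):
        """Sending a packet to a peer opens the return path for that peer."""
        self.permitted.add(peer)

    def accepts_from(self, peer):
        """Inbound packets are relayed only for peers contacted earlier."""
        return peer in self.permitted


alloc = RelayAllocation(("198.51.100.1", 49152))
peer = ("203.0.113.7", 40000)
print(alloc.accepts_from(peer))   # nothing sent yet, so the packet is dropped
alloc.client_sends_to(peer)
print(alloc.accepts_from(peer))   # the peer may now reach the client
```

Because unsolicited packets are dropped, a server that waits for arbitrary incoming connections cannot sit behind such an endpoint.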

Although TURN always makes it possible to establish a connection, redirecting all traffic through the TURN server places a heavy load on the server. Therefore, TURN should only be considered as a last resort if other methods like STUN do not lead to success.

ICE as a Lubricant

In 2004, the IETF began to develop the ICE technique. For any type of session protocol, ICE ensures trouble-free passage through all types of NAT and firewalls. ICE was designed so that the required addressing functions can be implemented with the SIP protocol and thus also with the Session Description Protocol (SDP). ICE acts as a uniform framework around STUN and TURN. Additionally, ICE supports TCP as well as UDP media sessions.

Instead of relying on only STUN or TURN, an ICE client is able to determine the required addresses with both methods. Both addresses are transmitted to the communication partner along with the local interface addresses in the subsequent SIP call setup message. The elements of the address information contained in the invitation message are known as the "candidates," which are the potential communication endpoints for the SIP agent. When an invitation message reaches the call recipient, the latter also runs the ICE address collection functions and transmits its own addresses in its SIP reply. Both agents then check the possible connections by sending STUN messages from one agent to the other end of the communication path to discover which pair of candidates works. Once a functioning pair of candidates has been found, the media stream begins to flow between the two communication partners.

ICE goes through six steps to establish a connection:

Step 1. The call initiator collects the IP and port addresses of all potential communication candidates before the actual call. The first candidates are obtained from the interfaces of the local computer (host). If the host has several interfaces, the agent obtains a candidate from each interface. The candidates of the computer interfaces (including virtual interfaces) are referred to as host candidates. The agent then contacts the STUN server from each host interface. The results of these tests are server-reflexive candidates, which correspond to the IP and port addresses of the outermost NAT on the path between the agent and the STUN server; this is usually the NAT facing the public Internet. Finally, the agent also obtains relay candidates from TURN servers. These IP and port addresses reside on the relay servers.

Step 2. After the agent has collected its candidates, it assigns a priority to each one. The candidate with the highest priority is the preferred one. As a rule, relay candidates receive the lowest priority because they have the highest voice delay.
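
The prioritization in Step 2 can be sketched with the formula and type preferences that the ICE specification (RFC 5245) recommends, as far as I recall them. The function name and the default values for local preference and component ID are assumptions chosen for illustration.

```python
# Recommended type preferences: higher means more desirable.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)"""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))


for cand_type in ("host", "srflx", "relay"):
    print(cand_type, candidate_priority(cand_type))
```

With these defaults, a host candidate ends up above a server-reflexive one, and a relay candidate comes last, which matches the rule of thumb that relays are the least attractive path.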

Step 3. Using the identified and prioritized candidates, the agent generates its SIP INVITE request to establish the call. The INVITE request carries an SDP message in its body, which the caller uses to transmit the connection information required for the call, including the codec, its parameters, and the IP and port addresses to be used. ICE extends SDP by adding some new attributes. The most important of these is the candidate attribute. Because the agent might know more than one possible candidate, it transmits a separate candidate attribute in the SDP message for each candidate of each media stream. The attribute contains the IP and port addresses for the candidate concerned, its priority, and the type of candidate (host, server reflexive, or relay). Additionally, the SDP message contains information for safeguarding the STUN functions.
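
To give an impression of what such a candidate attribute looks like, the following Python sketch formats candidate lines in the a=candidate syntax of the ICE specification as I understand it. The candidate_line() helper, the foundation values, and all addresses are hypothetical, and only UDP candidates are covered.

```python
from typing import Optional, Tuple


def candidate_line(foundation: str, component: int, priority: int,
                   ip: str, port: int, cand_type: str,
                   related: Optional[Tuple[str, int]] = None) -> str:
    """Format one SDP candidate attribute for a UDP candidate."""
    line = (f"a=candidate:{foundation} {component} UDP {priority} "
            f"{ip} {port} typ {cand_type}")
    if related is not None:
        # Reflexive and relay candidates can name the address they derive from.
        line += f" raddr {related[0]} rport {related[1]}"
    return line


print(candidate_line("1", 1, 2130706431, "192.168.1.20", 5060, "host"))
print(candidate_line("2", 1, 1694498815, "203.0.113.7", 54321, "srflx",
                     related=("192.168.1.20", 5060)))
```

Each line carries the address, the priority, and the candidate type, so the peer can rebuild the caller's candidate list from the SDP message alone.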

Step 4. SIP transmits the SIP INVITE message with the corresponding SDP information over the network. If the called agent also supports ICE, the phone does not ring yet. Instead, the called party first collects its candidates and generates a preliminary SIP response, which signals to the caller that the SIP request is still being processed. The preliminary response contains an SDP message with the called party's candidates.

Step 5. The caller and the called party have now exchanged the necessary SDP messages, and the agents involved in the call know all candidates for transferring the media streams. Note that certain applications (e.g., videophones) generate more than one media stream. ICE then performs the most important part of its tasks. Each agent knows its own candidates and the corresponding candidates of its peer, which together form the list of possible candidate pairs. Each agent calculates the priority of the candidate pairs (the combined priority of the individual candidates), and the candidate pair with the highest priority is expected to offer the optimal path between the two communication partners.
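
The combined priority can be sketched with the pair formula from the ICE specification (RFC 5245) as I understand it, where G is the candidate priority of the controlling agent and D that of the controlled agent. The sketch assumes that the caller is the controlling agent; the function names and sample priorities are hypothetical.

```python
from itertools import product


def pair_priority(controlling, controlled):
    """pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)"""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def ordered_pairs(local, remote):
    """Combine every local candidate with every remote one, best pair first.

    local and remote map a candidate name to its priority; the local agent
    is assumed to be the controlling agent.
    """
    pairs = [((lname, rname), pair_priority(lprio, rprio))
             for (lname, lprio), (rname, rprio)
             in product(local.items(), remote.items())]
    return sorted(pairs, key=lambda item: item[1], reverse=True)


caller = {"host": 2130706431, "relay": 16777215}
callee = {"host": 2130706431, "srflx": 1694498815}
for names, priority in ordered_pairs(caller, callee):
    print(names, priority)
```

Because the weaker candidate dominates the result, any pair that involves a relay candidate sinks to the bottom of the list, regardless of how good the other side's candidate is.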

Step 6. For the final verification of the candidate pairs, ICE conducts connectivity checks on the basis of STUN transactions from each agent. The STUN transactions use the IP and port addresses of the candidate pairs and verify that each pair is reachable in both directions. Because the number of candidate pairs grows in proportion to the square of the number of candidates, checking all pairs in parallel is problematic. ICE therefore checks the candidate pairs sequentially by priority. Every 20 ms, each agent generates a STUN transaction for the next pair of candidates in the list. If an agent receives a STUN request for a candidate pair, it immediately generates a STUN transaction in the opposite direction, known as a triggered check, which accelerates the entire ICE process. After successfully completing the check of a candidate pair, the agent knows that it has found a pair that can carry the media stream correctly. Because the checks are carried out according to the priorities of the candidate pairs, the first functioning candidate pair represents the best possible connection between the two communication partners at the given time. The caller usually confirms the candidate pair found by this process to the other agent, concluding the selection process.
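
The sequential checking in Step 6 can be simulated with a short Python sketch. Nothing is sent over the network here: the is_reachable callback stands in for a real STUN transaction, triggered checks are ignored, and the run_checks() name and sample pairs are hypothetical. Only the 20 ms pacing is taken from the text.

```python
PACING_MS = 20  # one new check per agent every 20 ms


def run_checks(pairs_by_priority, is_reachable):
    """Check pairs in priority order; return (first working pair, start time in ms)."""
    for index, pair in enumerate(pairs_by_priority):
        started_at_ms = index * PACING_MS
        if is_reachable(pair):
            return pair, started_at_ms
    return None, len(pairs_by_priority) * PACING_MS


pairs = [("host", "host"), ("srflx", "srflx"), ("relay", "relay")]
# Assume both agents sit behind different NATs, so host-to-host fails.
winner, started = run_checks(pairs, lambda pair: pair[0] != "host")
print(winner, started)
```

With n candidates on each side there are up to n * n pairs to schedule, so in the worst case the last check only starts after roughly n * n * 20 ms, which is why the ordering by priority matters.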

All previous processes (candidate collection and connection tests) take place before the phone rings at the called agent's end; consequently, the connection setup is minimally delayed by ICE. The advantage, however, is that ghost calls and misconnections (i.e., the phone rings, but the called party hears nothing) are eliminated.

If the ICE handshake reveals that the candidate pair differs from the default setting selected in the SDP message (IP and port addresses), the caller initiates an update of the default setting on the basis of a SIP re-INVITE message to synchronize all intermediate SIP elements that do not support ICE but need to know through which addresses the media streams are running.
