Lead Image © Eric Issel, 123RF.com

Lead Image © Eric Issel, 123RF.com

Managing Bufferbloat

All Puffed Up

Article from ADMIN 57/2020
Bufferbloat impedes TCP/IP traffic and makes life difficult, especially for real-time applications like voice or video.

Data sent on a journey across the Internet often takes different amounts of time to travel the same distance. This delay time, which a packet experiences on the network, comprises:

  • transmission delay, the time required to send the packet over the communication links;
  • processing delay, the time each network element spends processing the packet; and
  • queue delay, the time spent waiting for processing or transmission.

The data paths between communicating endpoints typically consist of many hops with links of different speeds. The lowest bandwidth along the path represents the bottleneck, because the packets cannot reach their destination faster than the time required to transmit a packet at the bottleneck data rate.

In practice, the delay time along the path – the time from the beginning of the transmission of a packet by the sender to the reception of the packet at the destination by the receiver – can be far longer than the time needed to transmit the packet at the bottleneck data rate. To ensure a constant packet flow at maximum speed, you need a sufficient number of packets in transmission to fill the path between the sender and the destination.

Buffers temporarily store the packets in a communication link while it is in use, which requires a corresponding amount of memory in the connecting component. However, the Internet has a design flaw known as bufferbloat that is caused by the incorrect use of data buffers.

TCP/IP Data Throughput

System throughput is the data rate at which the number of packets transmitted from the network to the destination is equal to the number of packets transmitted into the network. If the number of packets in transmission increases, the throughput increases until the packets are sent and received at the bottleneck data rate. If more packets are transmitted, the receive rate will not increase. If the network has large buffers along the path, they are filled with the additional packets and the delay increases.

Logically, a network without buffers has no space to buffer packets waiting to be transmitted. For this reason, the additional packets are deleted. If the transmission rate is increased, the loss rate increases accordingly. To work without intermediate buffers, packet arrivals must be predictable and lossless. In such a case, synchronized timing should ensure that packet losses do not occur. Such networks are complex, expensive, and inflexible. A well-known example of a bufferless network is the analog phone network. The addition of buffers to networks and the packaging of data into packets of variable size led to the development of the Internet.

Data transport on the Internet is grounded on the TCP/IP protocols. The basis of TCP is the idea of line capacity and the knowledge that excessive buffering does not occur along the data path to impede sending a certain volume of data at a time. The early Internet suffered from insufficient buffering. Even under moderate load, a data burst could cause a bottleneck in the transmission of packets (whether one or more connections), and data could be lost because of insufficient bandwidth. The losses eliminated congestion on the network but also led to a drop in throughput. To avoid these problems, sufficiently large buffers were used in the linking components, thus avoiding poor network utilization.

As part of the solution, slow-start and congestion-avoidance algorithms were integrated into the TCP protocol. These additional features created the conditions for the rapid growth of the Internet in the 1990s, as the algorithms maximized throughput, minimized delays, and ensured low losses (Figure 1). The source and target TCPs attempt to determine the line capacity between the two communication partners and then balance the number of packets in transmission. Because network connections are used by many applications simultaneously and conditions on the transmission paths change dynamically, the algorithms continuously examine the network and adjust the number of packets in transmission.

Figure 1: The relationship between throughput and delay for a packet-switched network.

Discovery of Bufferbloat

A few years ago, a Google programmer working at home uploaded a large file to his work server. His children complained to him that his work was negatively affecting their Internet traffic. Of course, the expert wondered how his uploading activities could affect his children's downloads. With no clear answer, he set out to investigate this question.

By experimenting with pings and different load levels on his Internet connection, he discovered that the latency times were often four to 10 times greater than expected. He named this phenomenon "bufferbloat." His conclusion was that some data packets are trapped in excessively large buffers for short periods of time.

The bufferbloat problem was not recognized for a long time for three reasons:

  • Bufferbloat is closely related to the functionality of the TCP protocol and the management of dynamic network buffers. Even in the 21st century, many programmers do not fully understand the management of dynamic buffers across network connections and the components interacting on the transmission path.
  • A widespread misconception is that discarding packets on the Internet is always a problem. However, the truth is that this process is how the TCP protocol functions correctly.
  • Many think that the best way to eliminate poor performance is to increase bandwidth.

Understanding what bufferbloat is and how TCP works is important when it comes to fixing it.

Effects of Bufferbloat

Imagine vehicles driving along an imaginary road. The cars are trying to get from one end to the other as fast as possible, driving almost bumper to bumper at the highest safe speed. The vehicles are, of course, the IP packets, and the road is a network connection. The bandwidth of the connection corresponds to the speed that the cars, including their loads, can travel from one end of the road to the other, and latency is the time it takes each car to get from one end of the road to the other.

One of the problems facing road networks is congestion. If too many cars try to use the road at once, unpredictable things happen (e.g., cars can run off the road or just break down. On the Internet, this is called packet loss, which should be kept to a minimum but cannot be eliminated completely.

One way to tackle a congestion problem would be a metering device of some sort that interrupts the road to the destination, such as ramp meters that control traffic merging onto freeways. When the driver approaches the freeway, a traffic light at the entrance tells them when they can enter the traffic flow. When traffic is low and the lights are not in use, they can simply merge into traffic. The traffic meter controls the timing and speed at which vehicles leave the on-ramp so the number of vehicles on the freeway is maintained at a reasonable level. The traffic meter is continually informed of the traffic congestion and tries to ensure that the new traffic does not have any adverse repercussions.

In this case, only the traffic light manages the merging traffic. The timing of the light has to ensure that the maximum capacity of the freeway is not exceeded, and the arriving cars rely on the fact that there is always enough space on the ramp for them in terms of speed of passage. On the Internet, the freeway ramp is a package buffer. Because network hardware and software developers hate unspecific packet behavior, just as highway builders hate car accidents, networkers have set up many huge buffers all over the network.

On a network, quantities optimize the available bandwidth. In other words, the network maximizes the amount of data that can be transmitted over the network in a constant time. However, these buffers affect latency, so the system has to make one more effort with the cars waiting to merge onto the freeway. It is rush hour and all kinds of vehicles are arriving faster than they can move on. Emergency vehicles, normal cars, delivery vans, and trucks accumulate on the ramp in the order of their arrival, and the traffic meter only deals with the vehicles directly in front of it (i.e., a First In, First Out (FIFO) queuing mechanism). However, the buffer ramp is very large, and the traffic meter does not know which vehicles are important and which are not. Traffic continues to back up and the ramp becomes overloaded.

When this happens on the Internet, the ramp (buffer) adds latency to the connection in question, because packets reach their destinations after long delays and the previously smooth network traffic begins to stutter. The cars try to find alternative routes and, in most cases, fail. In practice, a continuous stream of incoming vehicles tends to bunch together. The size of the bunch depends on the width of the exit. This bunching of traffic inevitably leads to additional accidents. Throughput is reduced and it appears as if the buffer is not even there. Practice has shown that the larger (the more inflated) the buffer, the worse the problems become. In the meantime, some extremely large buffers can be discovered on the Internet. If these buffers were freeway connections, they would be the size of Germany.

Now imagine a huge network of roads and freeways, each with traffic circles that act as buffers at its intersections. The cars on the route are trying to get through as quickly as possible, but they will experience several cascades of delays, and the traffic, which initially runs smoothly, becomes increasingly bunched and chaotic: The congested traffic from upstream buffers clogs the downstream buffers, even though the same volume of traffic would be handled without any problems if it ran smoothly. Such behavior leads to severe and sometimes irretrievable packet losses. As network traffic increases, it is increasingly transmitted in data bursts, and these patterns become more and more chaotic. The individual connections quickly fluctuate back and forth between idle and overload cascades. As a result, delays and packet run times change dramatically and do not follow a predictable pattern.

Packet loss – which should be prevented by integrating buffers – increases dramatically once all buffers are filled because of the random occurrence of thousands of packets causing Internet routers to slow down data transmission. One of the most obvious consequences of this is the latency peaks and thus the slowdown of the most frequently used services (e.g., DNS lookup). Voice over IP services (VoIP, e.g., Skype) and video streaming only work sporadically and can hardly be used.

The way these latency-sensitive services deteriorate illustrates the bufferbloat problem: The perceived speed of the Internet is more a function of latency (time to response) than bandwidth. Thus, bufferbloat changes the features that are most important to users: As the buffers on the network grow larger and more numerous, the effect of bufferbloat becomes greater. However, increasing bandwidth does not eliminate the bufferbloat cascades, and higher bandwidths often make the problem even worse.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy ADMIN Magazine

Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

comments powered by Disqus