Photo by Patrick Fore on Unsplash

Software instead of appliances: load balancers compared

Balancing Act

Article from ADMIN 62/2021
We introduce the most important software load balancers, look at their strengths and weaknesses, and provide recommendations for use scenarios.

For a long time now, load balancers have supported any number of back-end servers. If the load in the setup becomes too high, you just add more servers. In this way, almost any resource bottleneck can be prevented. Understandably, the load balancer is a very important part of almost every web setup – if it doesn't work, the website is down. Appliances from F5, Citrix, Radware, and others were considered the quasi-standard in the load balancer environment for a long time, although powerful software solutions that ran as services on normal Linux servers (e.g., HAProxy or Nginx) were available.

The cloud-native trend is a major factor accelerating the shift toward software-defined load balancing these days, not least because typical appliances are difficult to integrate into cloud-native applications – reason enough for a brief market overview. What tools for software-based load balancing are available to Linux admins? What are their specific strengths and weaknesses? Which software is best suited for which scenario?

Before I get started, however, note that not all of the load balancers presented here offer the same features.

OSI Rears Its Head

As with hardware load balancers, different balancing techniques exist for their software counterparts. In a nutshell, many load balancers primarily differ in terms of the Open Systems Interconnection (OSI) layer on which they operate, and some load balancers offer support for multiple OSI layers. What sounds dull in theory has significant implications in practice.

Most software load balancers for Linux operate on OSI Layer 4, which is the transport layer. They accept incoming connections for a transport protocol, usually TCP, on one side and forward them to their back ends on the other. Load balancing at this layer is agnostic with respect to the application protocol carried on top. Therefore, a Layer 4 load balancer can distribute HTTP connections just as easily as MySQL connections, or indeed any other protocol.
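To make the protocol-agnostic nature of Layer 4 balancing concrete, the following minimal Python sketch accepts TCP connections and relays the raw bytes to back ends chosen in round-robin order. It never looks at the payload, so it would forward HTTP and MySQL traffic alike. The back-end addresses, the listening port, and all function names are assumptions made for illustration; this is not how any of the products discussed here is implemented.

```python
import asyncio
import itertools

# Hypothetical back-end addresses, for illustration only.
BACKENDS = [("10.0.0.11", 3306), ("10.0.0.12", 3306)]


class RoundRobin:
    """Hands out back ends in a fixed rotating order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self):
        return next(self._cycle)


async def pipe(reader, writer):
    """Copy bytes one way until the sender closes; never inspect them."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def handle_client(picker, client_reader, client_writer):
    """Connect a new client to the next back end and relay both directions."""
    host, port = picker.next_backend()
    backend_reader, backend_writer = await asyncio.open_connection(host, port)
    await asyncio.gather(
        pipe(client_reader, backend_writer),
        pipe(backend_reader, client_writer),
    )


async def serve(listen_port=3307):
    """Listen on listen_port and balance every new connection."""
    picker = RoundRobin(BACKENDS)

    async def on_connect(reader, writer):
        await handle_client(picker, reader, writer)

    server = await asyncio.start_server(on_connect, "0.0.0.0", listen_port)
    async with server:
        await server.serve_forever()

# To try it against real back ends: asyncio.run(serve())
```

Because the balancing decision is made once per connection and before any payload is read, nothing in this sketch could implement protocol-specific behavior such as sticky sessions.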

Conversely, Layer 4 also means that protocol-specific options are not available for balancing. For classical load balancing on web servers, for example, many applications expect their session to be sticky, which means that the same clients always end up on the same web server for multiple successive requests according to various parameters (e.g., session IDs).

However, the load balancer can only guarantee this stickiness if it not only forwards packets at the transport level, but also analyzes and interprets the data flow. Many load balancers support this behavior, but it is then referred to as Layer 7 (i.e., the application layer) load balancing.
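A Layer 7 balancer has to parse the request before it can make a sticky decision. The sketch below shows one possible approach: It reads a session cookie from the HTTP request headers and hashes it onto a list of back ends. The cookie name `SESSIONID`, the back-end names, and both functions are assumptions for illustration; real balancers offer several other persistence methods.

```python
import hashlib
from http.cookies import SimpleCookie

BACKENDS = ["web1", "web2", "web3"]  # hypothetical back-end names


def session_id_from_request(raw_request, cookie_name="SESSIONID"):
    """Extract a session cookie from raw HTTP/1.x request headers, or None."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "cookie":
            cookie = SimpleCookie()
            cookie.load(value.strip())
            if cookie_name in cookie:
                return cookie[cookie_name].value
    return None


def pick_backend(session_id, backends=BACKENDS):
    """Map a session ID to a back end deterministically."""
    digest = hashlib.sha256(session_id.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

Note that simple modulo hashing reassigns many sessions whenever the number of back ends changes, which is one reason production balancers tend to rely on other mechanisms.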

When choosing a load balancer, support for Layer 7 load balancing is therefore highly relevant. Some balancers in this comparison explicitly support Layer 4 only; others also offer protocol support. In this article, I state the supported OSI layers individually for each solution tested.

High Availability

When admins are considering software load balancers, they need to keep the load balancer itself in mind as a potential source of problems, rather than just considering the website's functionality. Appliance vendors draw on this as a welcome opportunity to cash in once again. Load balancer appliances from all manufacturers can be operated in high-availability (HA) setups, but you need at least two of the devices. It makes no difference that the firmware already includes this functionality: In addition to the second device, a license extension is usually required to install the two balancers in an HA cluster.

With self-built load balancers, admins resolve the HA issue themselves by designing for hardware redundancy. Now that even the infamous half-height pizza boxes come with 256GB of RAM and multicore CPUs, the financial outlay is manageable. Moreover, the software load balancers presented here are not particularly greedy when it comes to resources. The software side is a bit more uncomfortable: IP addresses that move between nodes together with the services bound to them can be implemented in Linux as a typical HA cluster with Pacemaker.

Classic Choice: HAProxy

The first test candidate in the comparison is a true veteran that most admins will have encountered. HAProxy [1] is one of the oldest solutions in the test field; the first version saw the light of day at the end of 2001. With regard to the OSI model, HAProxy is always based on Layer 4; it implements balancing for any TCP/IP connection. HAProxy is therefore always suitable as a load balancer for MySQL or other databases. Complementary Layer 7 functionality can be used, but only for HTTP – other protocols are excluded.

Because HAProxy has been on the scene for such a long time, it offers several features that many use cases never need. However, HAProxy has no weaknesses in terms of basic functions. It supports HTTP/2 as well as compressed connections with Gzip, HTTP URL rewrites, and rate limiting to counter distributed denial of service (DDoS) attacks. Because the program has been capable of multithreading for several years, it benefits from modern multicore CPUs, especially those that also support hyperthreading. In any case, HAProxy is unlikely to overwhelm a reasonably up-to-date multicore processor.
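The idea behind per-client rate limiting can be illustrated with a sliding-window counter. The Python sketch below is a generic technique and not HAProxy's own implementation; the class name and its parameters are assumptions for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behavior can be tested
        self._hits = defaultdict(deque)  # client key -> request timestamps

    def allow(self, client):
        """Return True and record the request if the client is under its limit."""
        now = self.clock()
        hits = self._hits[client]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A balancer would typically key such a counter on the client IP address and reject or delay requests once `allow()` returns `False`.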

The combination of secure sockets layer (SSL) and load balancing is a complex issue, because end-to-end SSL encryption leaves no room for a middleman like HAProxy that needs to read the traffic. Therefore, an established practice is to let the load balancer handle SSL termination instead of the HTTP server. In such cases, of course, the admin can additionally set up SSL encryption between the individual web servers and the load balancer. HAProxy can handle both: The service terminates SSL connections on demand and also talks to the back ends in encrypted form (e.g., when checking their states). HAProxy has extensive options for checking the functionality of its own back ends. If a back end goes offline or the web server running there does not deliver the desired results, HAProxy automatically removes it from the rotation.
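The principle of taking a failing back end out of rotation can be sketched in a few lines of Python. The example assumes a simple policy: A back end is marked down after `fall` consecutive failed checks and marked up again after `rise` consecutive successful ones. The class, the function, and the default thresholds are assumptions for illustration, not HAProxy's code.

```python
import socket


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class BackendPool:
    """Tracks which back ends are currently eligible for traffic."""

    def __init__(self, names, fall=3, rise=2):
        self.fall = fall
        self.rise = rise
        self._state = {n: {"up": True, "streak": 0} for n in names}

    def report(self, name, ok):
        """Record one health-check result for a back end."""
        s = self._state[name]
        if ok == s["up"]:
            s["streak"] = 0  # result agrees with the current state
            return
        s["streak"] += 1
        threshold = self.rise if ok else self.fall
        if s["streak"] >= threshold:
            s["up"] = ok
            s["streak"] = 0

    def in_rotation(self):
        """Return the names of all back ends currently marked up."""
        return [n for n, s in self._state.items() if s["up"]]
```

Requiring several consecutive results before changing state keeps a single dropped packet from flapping a healthy server in and out of the pool.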

All told, HAProxy is a reliable load balancer with a huge range of functions, which, however, makes the program's configuration file difficult to understand. That said, any admin in a state-of-the-art data center will probably generate the configuration from the automation setup anyway, so complexity is not a difficulty.
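As a rough idea of what generating the configuration from automation can look like, the following Python sketch renders a small HAProxy configuration from a list of back ends. The section names `web_in` and `web_servers`, the timeouts, and the addresses are assumptions for illustration; a real setup would more likely use the templating features of its automation tool.

```python
def render_haproxy_cfg(backends, listen_port=80):
    """Render a minimal HAProxy configuration for (name, address) pairs."""
    lines = [
        "defaults",
        "    mode http",
        "    timeout connect 5s",
        "    timeout client 30s",
        "    timeout server 30s",
        "",
        "frontend web_in",
        f"    bind *:{listen_port}",
        "    default_backend web_servers",
        "",
        "backend web_servers",
        "    balance roundrobin",
    ]
    for name, address in backends:
        # "check" enables health checks for this server
        lines.append(f"    server {name} {address} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical inventory, e.g., pulled from the automation system
    print(render_haproxy_cfg([("web1", "10.0.0.11:8080"),
                              ("web2", "10.0.0.12:8080")]))
```

The generated file is short here, but the same loop scales to hundreds of servers without anyone having to read the result by hand.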

HAProxy integrations are available for all common automation systems, either from the vendor or from third-party providers. I also need to mention the commercial version of HAProxy: In addition to support, it comes with a few plugins for additional functions, such as a single sign-on (SSO) module and a real-time dashboard in which various HAProxy metrics can be displayed.

Speaking of metrics data, integrating HAProxy into conventional and state-of-the-art monitoring systems is very simple thanks to previous work by many users and developers. The tool comes with its own status page, which provides a basic overview (Figure 1). All told, HAProxy can be considered a safe choice when it comes to load balancers.

Figure 1: HAProxy offers a status page with basic details on the current usage of the service. © HAProxy
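The status page data can also be exported as CSV, which makes it easy to feed into your own checks. The Python sketch below parses such an export and lists servers reported as down. The sample is made up and trimmed to four columns for illustration; a real export contains many more fields, and the function names are assumptions.

```python
import csv
import io

# Invented sample, trimmed to a few columns of the real export.
SAMPLE = """\
# pxname,svname,scur,status
web_in,FRONTEND,12,OPEN
web_servers,web1,7,UP
web_servers,web2,0,DOWN
web_servers,BACKEND,7,UP
"""


def parse_stats(csv_text):
    """Parse stats CSV text into a list of dicts keyed by column name."""
    text = csv_text.lstrip("# ")  # the header line starts with "# "
    return list(csv.DictReader(io.StringIO(text)))


def servers_down(rows):
    """Return (proxy, server) pairs for individual servers that are down."""
    return [
        (r["pxname"], r["svname"])
        for r in rows
        if r["svname"] not in ("FRONTEND", "BACKEND")
        and r["status"].startswith("DOWN")
    ]
```

With the sample above, `servers_down(parse_stats(SAMPLE))` reports `web2` in the `web_servers` back end.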
