An IP-based load balancing solution
Load balancing is one of the major requirements for most web server farms. These web servers are expected to serve thousands of HTTP and FTP requests per second. Consequently, for such busy web servers, load balancing is no longer a luxury but an essential requirement for both performance and availability. Initially, hardware load balancers were used for this purpose. However, as the cost of hardware load balancers has risen and software has matured, the trend now is toward software load balancers.
Linux Virtual Server
Linux has a major advantage over other operating systems in this regard, because most of its popular distributions (e.g., Red Hat and openSUSE) already provide built-in load balancing capabilities for most network services, such as web, cache, mail, FTP, media, and VoIP. These capabilities are based on layer 4 switching, which allows Linux servers to form a special kind of cluster known as an LVS cluster.
LVS stands for Linux Virtual Server, which is a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system.
The Linux Virtual Server Project implements layer 4 switching in the Linux kernel, which allows TCP and UDP sessions to be load balanced between multiple real servers. Thus, it provides a way to scale Internet services beyond a single host. HTTP and HTTPS traffic for the web is probably the most common use of LVS clusters.
Note that the architecture of this virtual IP cluster is fully transparent to end users, who interact with it as if it were a single high-performance real server. Different variants of LVS technology have been adopted by many Linux distributions, including Debian, Red Hat, and openSUSE.
General Architecture of LVS Clusters
For transparency, scalability, availability, and manageability of the whole system, LVS clusters usually adopt a three-tier architecture, which is illustrated in Figure 1.
This three-tier architecture consists of:
- Load Balancer – This is the front-end machine of the whole cluster system. It balances requests from clients among a set of servers, so that clients perceive all services as coming from a single IP address.
- Server Cluster – This is a farm of real servers running actual network services, such as web, mail, FTP, DNS, or media service.
- Shared Storage – This layer provides a shared storage space for the servers, so that it is easy for the servers to have the same contents and provide the same services. (A common example would be Linux GFS, which allows a set of HTTP servers to access a common log filesystem.)
In this kind of architecture, the load balancer is the single point of entry to the server cluster. It runs IPVS (IP Virtual Server), which implements IP load balancing techniques inside the Linux kernel. With IPVS, all real servers are required to provide the same services and contents; the load balancer forwards each new client request to a server according to the specified scheduling algorithm and the load of each server. No matter which server is selected, the client should get the same result.
Note that there is no fixed upper limit on the number of real servers. For most network services such as the web, where client requests are generally independent of one another and can be served in parallel, overall throughput is expected to increase roughly linearly as real servers are added.
Shared storage can be database systems, network file systems, or distributed file systems. In the case where real servers have to write information dynamically to databases, you can expect to have a highly available cluster (active-passive or active-active; like RHEL cluster suite, Oracle RAC, or DB2 parallel server) on this back-end layer.
In the case where the data handled by real servers is static (e.g., web services), you could have either a shared filesystem over the network (e.g., NFS) for smaller implementations or a cluster filesystem (e.g., IBM GPFS or Linux GFS) for bigger implementations. In even smaller environments with fewer security requirements, this back-end layer can be skipped, and real servers can write directly to their local storage.
Layer 4 Switching
IPVS performs "layer 4 switching," which works by multiplexing incoming TCP/IP connections and UDP/IP datagrams to real servers. Packets are received by the Linux load balancer, and a decision is made regarding which real server to forward each packet to. Once this decision is made, subsequent packets belonging to the same connection are sent to the same real server. Thus, the integrity of the connection is maintained.
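The "decide once, then stick" behavior described above can be sketched as a small lookup table. This is only a toy model in Python, not the IPVS kernel data structure: the class name, the tuple layout of the connection key, the example addresses, and the placeholder round-robin choice are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical connection key:
# (protocol, client IP, client port, virtual IP, virtual port)
ConnKey = Tuple[str, str, int, str, int]


class ConnectionTable:
    """Toy model of per-connection stickiness in a layer 4 switch."""

    def __init__(self, real_servers: List[str]) -> None:
        self.real_servers = real_servers
        self.table: Dict[ConnKey, str] = {}
        self._next = 0  # index for the placeholder round-robin decision

    def forward(self, key: ConnKey) -> str:
        """Return the real server that should receive this packet."""
        server = self.table.get(key)
        if server is None:
            # First packet of a new connection: make the scheduling decision
            # and remember it for every later packet of this connection.
            server = self.real_servers[self._next % len(self.real_servers)]
            self._next += 1
            self.table[key] = server
        return server


lb = ConnectionTable(["10.0.0.11", "10.0.0.12"])
conn_a = ("tcp", "198.51.100.7", 40001, "203.0.113.10", 80)
conn_b = ("tcp", "198.51.100.8", 40002, "203.0.113.10", 80)

first = lb.forward(conn_a)   # new connection, a server is chosen
second = lb.forward(conn_b)  # another new connection, next server
again = lb.forward(conn_a)   # later packet of conn_a, same server as before
```

A real implementation would also expire entries when connections close or time out; that is omitted here to keep the sketch short.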
The Linux Virtual Server has three different ways of forwarding packets: network address translation (NAT), IP-IP encapsulation (tunneling), and direct routing.
- Network Address Translation (NAT) – In the context of layer 4 switching, packets are received from end users, and the destination port and IP address are changed to that of the chosen real server. Return packets pass through the Linux load balancer at which time the mapping is undone, so the end user sees replies from the expected source.
- Direct Routing – Packets from end users are forwarded directly to the real server. The IP packet is not modified, so the real servers must be configured to accept traffic for the virtual server's IP address. The real server may send replies directly back to the end user. Thus, the load balancer does not need to be in the return path.
- IP-IP Encapsulation (Tunneling) – In the context of layer 4 switching, the behavior is very similar to that of direct routing, except that forwarded packets are encapsulated in another IP packet instead of being delivered by manipulating only the Ethernet frame.
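Of the three methods listed above, NAT is the only one that rewrites packets in both directions, so it is the one most worth tracing by hand. The following Python sketch models just that rewriting step under simplifying assumptions: packets are reduced to source and destination endpoints, and the names `Packet` and `NatBalancer` and all addresses are hypothetical, not part of LVS.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass(frozen=True)
class Packet:
    src: Endpoint
    dst: Endpoint


class NatBalancer:
    """Toy model of NAT forwarding on a load balancer."""

    def __init__(self, virtual: Endpoint) -> None:
        self.virtual = virtual
        # client endpoint -> real server endpoint chosen for that client
        self.mappings: Dict[Endpoint, Endpoint] = {}

    def inbound(self, pkt: Packet, real: Endpoint) -> Packet:
        """Rewrite the destination from the virtual address to the real server."""
        self.mappings[pkt.src] = real
        return replace(pkt, dst=real)

    def outbound(self, pkt: Packet) -> Packet:
        """Undo the mapping so the reply appears to come from the virtual address."""
        if self.mappings.get(pkt.dst) != pkt.src:
            return pkt  # not a reply we are tracking; leave it untouched
        return replace(pkt, src=self.virtual)


client = ("198.51.100.7", 40001)
vip = ("203.0.113.10", 80)
real = ("10.0.0.11", 8080)

lb = NatBalancer(vip)
to_server = lb.inbound(Packet(src=client, dst=vip), real)
to_client = lb.outbound(Packet(src=real, dst=client))
```

With direct routing and tunneling there is no `outbound` step on the load balancer at all, because the real server answers the client itself.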
On the Linux load balancer, a virtual service is defined either by an IP address, port, and protocol, or by a firewall mark. Each virtual service is then assigned a scheduling algorithm that is used to allocate incoming connections to the real servers. In LVS, the schedulers are implemented as separate kernel modules. Thus, new schedulers can be implemented without modifying the core LVS code.
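In practice, virtual services are usually defined with the `ipvsadm` command-line tool, which the text above does not mention. As a hedged illustration, the Python helpers below only build `ipvsadm` argument lists and do not execute them (running them requires root and the IPVS kernel modules). The helper names and all addresses are assumptions; the flags are the short options for adding a service (`-A`), adding a real server (`-a`), and selecting NAT (`-m`), direct routing (`-g`), or tunneling (`-i`).

```python
from typing import List

PROTOCOL_FLAGS = {"tcp": "-t", "udp": "-u"}
FORWARDING_FLAGS = {"nat": "-m", "dr": "-g", "tun": "-i"}


def add_service(vip: str, port: int, scheduler: str = "rr",
                protocol: str = "tcp") -> List[str]:
    """Build the command that defines a virtual service and its scheduler."""
    return ["ipvsadm", "-A", PROTOCOL_FLAGS[protocol], f"{vip}:{port}",
            "-s", scheduler]


def add_real_server(vip: str, port: int, rip: str, rport: int,
                    method: str = "nat", weight: int = 1,
                    protocol: str = "tcp") -> List[str]:
    """Build the command that attaches one real server to a virtual service."""
    return ["ipvsadm", "-a", PROTOCOL_FLAGS[protocol], f"{vip}:{port}",
            "-r", f"{rip}:{rport}", FORWARDING_FLAGS[method],
            "-w", str(weight)]


commands = [
    add_service("203.0.113.10", 80, scheduler="rr"),
    add_real_server("203.0.113.10", 80, "10.0.0.11", 80, method="nat"),
    add_real_server("203.0.113.10", 80, "10.0.0.12", 80, method="nat"),
]
for argv in commands:
    print(" ".join(argv))
```

Keeping the commands as argument lists makes it easy to pass them to `subprocess.run` later without shell quoting issues.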
Many different scheduling algorithms are available to suit a variety of needs. The simplest are round robin and least-connection, which are also the ones most commonly used in LVS clusters.
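To make the two simplest schedulers concrete, here is a minimal Python sketch of each. It is a user-space illustration of the idea only, not the kernel modules themselves; the class names, method names, and server labels are assumptions.

```python
from itertools import cycle
from typing import Dict, List


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers: List[str]) -> None:
        self._rotation = cycle(servers)

    def pick(self) -> str:
        return next(self._rotation)


class LeastConnection:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers: List[str]) -> None:
        self.active: Dict[str, int] = {s: 0 for s in servers}

    def pick(self) -> str:
        # On a tie, the server listed first wins.
        server = min(self.active, key=lambda s: self.active[s])
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Call when a connection to `server` has finished."""
        self.active[server] -= 1


rr = RoundRobin(["a", "b", "c"])
rr_order = [rr.pick() for _ in range(4)]

lc = LeastConnection(["a", "b"])
lc_first = lc.pick()
lc_second = lc.pick()
lc.release("b")
lc_third = lc.pick()
```

Round robin ignores how busy each server is, which works well when requests cost about the same; least-connection adapts when some connections last much longer than others.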