High availability for RESTful services with OpenStack

Switchman

High availability has become a fundamental part of the server room. The cost for a single failure is often so significant that a properly implemented high-availability (HA) system is a bargain in comparison. Linux admins can use the Linux HA stack with Corosync and the Pacemaker cluster manager [1], a comprehensive toolbox for reliably implementing high availability. However, other solutions are also available. For example, if you need to ensure constant availability for HTTP-based services, load balancer systems are always recommended – especially for services based on the REST principle, which currently are very popular.

HA Background

High availability is a system state in which an elementary component can fail without causing extensive downtime. A classic approach to high availability is a failover cluster based on the active/passive principle. The cluster relies on multiple servers, ideally with identical hardware, the same software, and the same configuration. One computer is always active, and the second computer is on standby to take over the application that previously ran on the failed system. One elementary component in a setup of this kind is a cluster manager such as Pacemaker, which takes care of monitoring the servers, and if necessary, restarts the services on the surviving system.

For failover to work, a few system components need to collaborate. Installing the application on both servers with an identical configuration is mandatory. A second important issue is the data store: If the system is a database, for example, it needs to access the same data from both computers. This universal access is made possible by shared storage, in the form of a cluster filesystem like GlusterFS or Ceph, a network filesystem such as NFS, or a replication solution such as DRBD (if you have a two-node cluster). See the box called

...

Use Express-Checkout link below to read the full article (PDF).