By Thilo Uttendorfer, Valentin Höbel, and Markus Feilner
In emergencies, administrators need to know as quickly as possible whether computers in a private cloud are failing. A simple setup with KVM, Pacemaker, DRBD, and Opsview will help keep watch.
Since the Linux kernel 2.6.20 release in February 2007, the Kernel-Based Virtual Machine, KVM [1], has made much progress in its mission to oust other virtualization solutions from the market. KVM also frequently provides the underpinnings for a virtualization cluster that runs multiple guests in a high-availability environment, thanks to Open Source tools such as Heartbeat [2] and Pacemaker [3].
Very Little Monitoring
Many system administrators still don't monitor the hosts and virtual guests in their clusters. Heartbeat and Pacemaker have built-in alert functions, and many admins are happy with email notification of cluster status. However, a standardized, centralized monitoring system that also covers the virtual guest systems in a private cloud can give admins an impressive operations center for monitoring all of the systems at a glance.

Before you think about monitoring, you need to consider a couple of basic things about your cluster setup. A simple combination of Heartbeat and Pacemaker, with virtualization based on KVM, logical volumes, and DRBD [4], may not match the feature scope of VMware, but it won't cost you nearly as much either (Figure 1). With minimal effort, this simple Heartbeat solution provides a system in which a virtual instance is always available.
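As a rough illustration of such a setup, the following crm shell snippet sketches how a KVM guest can be tied to a DRBD resource in Pacemaker. The names (r0, web1), the path to the libvirt domain XML, and the timeouts are placeholders and will differ in your environment:

# DRBD resource r0, promoted to Master on exactly one node
primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"
ms ms_drbd_r0 p_drbd_r0 \
    meta master-max="1" clone-max="2" notify="true"
# KVM guest managed via libvirt
primitive p_vm_web1 ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/web1.xml" hypervisor="qemu:///system" \
    op start timeout="120s" \
    op stop timeout="120s" \
    op monitor interval="30s" timeout="60s"
# Run the guest only where DRBD is Primary, and only after promotion
colocation col_vm_on_drbd inf: p_vm_web1 ms_drbd_r0:Master
order ord_drbd_before_vm inf: ms_drbd_r0:promote p_vm_web1:start

The colocation and order constraints make sure the guest starts only on the node where the DRBD resource has been promoted, which is exactly the behavior the monitoring system later needs to keep an eye on.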
...
DRBD9 and DRBD Reactor create a Linux high-availability stack for virtual instances with replicated storage comparable to the classic Corosync and Pacemaker solution.
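As a minimal sketch of the DRBD Reactor approach, a promoter plugin configuration along the following lines (with a hypothetical DRBD resource r0 and systemd unit vm-web1.service) starts the guest's service on whichever node currently holds the DRBD Primary role:

# /etc/drbd-reactor.d/vm_web1.toml (example path and names)
[[promoter]]
[promoter.resources.r0]
start = ["vm-web1.service"]   # units started on the node that gets promoted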
A big advantage in virtualization is the ability to move systems from one host to another without exposing the user to a long period of downtime. To that end, the hypervisor and storage component need to cooperate.
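With libvirt-based KVM hosts and shared or replicated storage already in place, the move itself can be as simple as a single command; the guest name and target host here are examples only:

virsh migrate --live web1 qemu+ssh://node2/system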
Sometimes you have to be cruel to be kind: To avoid letting broken nodes in a Pacemaker cluster cause damage, you really need to let cluster nodes kill each other. A couple of details are very important.
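To give an idea of what such a fencing setup can look like in the crm shell, the snippet below defines an IPMI-based STONITH device for one node; host names, addresses, and credentials are placeholders, and each node needs its own device:

# Fencing device for node1, never run on node1 itself
primitive p_fence_node1 stonith:external/ipmi \
    params hostname="node1" ipaddr="192.168.100.1" \
           userid="admin" passwd="secret" interface="lanplus" \
    op monitor interval="60s"
location l_fence_node1 p_fence_node1 -inf: node1
property stonith-enabled="true"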
Cluster filesystems such as GFS2 and OCFS2 allow many clients simultaneous access to a storage device. Along with DRBD and Pacemaker, this offers a low-budget option for creating a redundant service – but you need to watch out for a couple of pitfalls.
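One of those pitfalls sits in the DRBD configuration itself: for GFS2 or OCFS2, both nodes must be allowed to hold the Primary role at the same time. A sketch of the relevant net section for a hypothetical resource r0 looks like this:

resource r0 {
  net {
    allow-two-primaries yes;               # required for dual-Primary operation
    after-sb-0pri discard-zero-changes;    # basic split-brain recovery policies
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
  # disk, device, and node sections omitted
}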