© nobeastsofierce, 123RF.com

Monitor your nodes with collectl

Collect All

Article from ADMIN 09/2012
Effectively monitoring your cluster can be one of the keys to understanding how the hardware and software are interacting. In many cases, this means examining the performance of a single node.

Once you have a cluster operating, typically the next thing you want to do is monitor that cluster. For example, are all the compute nodes operating correctly? Are the network, storage, and other components operating correctly as well?

A second task that many people find themselves performing on cluster nodes is diagnosing or debugging problems. These problems can be related to software or hardware or an interaction of both.

One of the most popular tools for both tasks is sar [1]. Although sar has been around for a long time and is well known to Linux administrators, it is lacking in some areas. In particular, it cannot monitor common high-performance computing (HPC) components such as InfiniBand and Lustre, and it lacks some tools for post-processing data. Both capabilities are important in HPC, so it would be nice to have a tool that does what sar does, monitors the HPC-specific systems that matter, and makes post-processing the data easy. One tool that can do this is collectl [2].

Introduction to collectl

Collectl is a tool written in Perl that gathers as much detail as possible from the /proc filesystem. Although a number of tools do this, collectl has some capabilities [3] that sar lacks. A collection of supporting tools [4] can also help collectl gather and post-process data.
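To give a feel for what gathering data from /proc involves, the following minimal Python sketch parses the aggregate cpu line of /proc/stat and computes a busy percentage between two samples. It is an illustration only and not how collectl itself is implemented; the function names (parse_cpu_line, busy_percent, read_cpu_sample) are hypothetical, and the field order is assumed to follow the proc(5) man page.

```python
from typing import Dict

# Assumed field order of the aggregate "cpu" line in /proc/stat (per proc(5)).
CPU_FIELDS = ("user", "nice", "system", "idle",
              "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Turn the aggregate 'cpu' line into a dict of named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Older kernels report fewer columns; zip() stops at the shorter sequence.
    values = [int(v) for v in parts[1:1 + len(CPU_FIELDS)]]
    return dict(zip(CPU_FIELDS, values))


def busy_percent(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Percentage of non-idle time between two samples (iowait counts as idle)."""
    deltas = {key: after[key] - before[key] for key in before}
    total = sum(deltas.values())
    if total == 0:
        return 0.0
    idle = deltas["idle"] + deltas.get("iowait", 0)
    return 100.0 * (total - idle) / total


def read_cpu_sample(path: str = "/proc/stat") -> Dict[str, int]:
    """Read one sample; the aggregate 'cpu' line is the first line of the file."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())
```

On a Linux node, calling read_cpu_sample() twice a second or so apart and passing both results to busy_percent() gives a rough CPU utilization figure. The counters are cumulative since boot, which is why the sketch works on differences between samples rather than on raw values.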

Collectl is easy to install if Perl is already installed on your system. In the HPC world, this means that Perl must be installed on all the compute nodes you want to monitor. You can download the noarch RPM from the collectl website, or you can grab the source tar file – it's up to

...