Photo by Kier… in Sight on Unsplash

Log analysis in high-performance computing

State of the Cluster

Article from ADMIN 72/2022
Log analysis can be used to great effect in HPC systems. We present an overview of the current log analysis technologies.

Gathering logs from distributed systems for manual searching is a typical task performed in high-performance computing (HPC) [1]. Log analysis is important for cybersecurity, understanding HPC cluster behavior, and event and trend analysis. In this article, I address the state of the art in log analysis and how it can be applied to HPC.


Log analysis can produce information through a variety of functions and technologies, including:

  • ingestion
  • centralization
  • normalization
  • classification and logging
  • pattern recognition
  • correlation analysis
  • monitoring and alerts
  • artificial ignorance
  • reporting
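
Three of the functions above (normalization, classification, and artificial ignorance) can be illustrated with a minimal Python sketch. It assumes classic syslog-style lines; the sample lines, the regular expression, and the "routine" patterns are all made up for illustration and are not taken from the article or from a real cluster. Artificial ignorance here simply means discarding entries already known to be routine so that whatever remains stands out.

```python
import re

# Hypothetical syslog-style sample lines (invented for this sketch).
SAMPLE_LINES = [
    "Mar  3 08:15:01 node001 CRON[2211]: (root) CMD (run-parts /etc/cron.hourly)",
    "Mar  3 08:15:07 node002 sshd[4312]: Accepted publickey for alice from 10.0.0.5 port 50122 ssh2",
    "Mar  3 08:16:40 node002 kernel: nvidia: module verification failed",
    "Mar  3 08:17:02 node003 sshd[977]: Failed password for root from 10.0.0.99 port 41200 ssh2",
]

# Normalization: split each raw line into the same named fields.
# The layout (timestamp, host, program[pid]: message) is an assumption.
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<program>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)

# Artificial ignorance: messages known to be routine on this
# (hypothetical) site. In practice the list grows over time.
ROUTINE_PATTERNS = [
    re.compile(r"^\(root\) CMD "),
    re.compile(r"^Accepted publickey for "),
]


def normalize(line):
    """Return a dict of fields for a parsable line, else None."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def is_routine(record):
    """True if the message matches any known-routine pattern."""
    return any(p.search(record["message"]) for p in ROUTINE_PATTERNS)


def unusual_records(lines):
    """Normalize all lines and keep only the non-routine records."""
    records = (normalize(line) for line in lines)
    return [r for r in records if r is not None and not is_routine(r)]


if __name__ == "__main__":
    for record in unusual_records(SAMPLE_LINES):
        print(record["host"], record["program"], record["message"])
```

Keeping the routine patterns in a plain list makes the filter easy to review and extend as you learn what is normal for your systems.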

Logs are great for checking the health of a set of systems and can be used to locate obvious problems, such as kernel modules not loading. They can also be used to find attempts to break into systems through various means, including shared credentials. However, these examples do not take full advantage of the information contained in logs: Log analysis can also be used to improve system administration skills.

When analyzing or just watching logs over a period of time, you can get a feel for the rhythm of your systems. For example: When do people log in and out of the system? What kernel modules are loaded? What, if any, errors occur (and when)? The answers to these questions let you recognize events, moments when something doesn't seem quite right with the systems, that "normal" log analysis might miss. A great question is: Why does user X have a new version of an application? Normal log analysis would not care about this query, but perhaps the user needed a newer version, which could indicate that others might need it as well, prompting you to build it and make it available to all.
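
One way to get a feel for the login rhythm described above is to count successful logins per hour of the day. The following Python sketch assumes syslog-style lines containing OpenSSH "Accepted ... for USER" messages; the sample lines and the default "usual hours" window are invented for illustration and would need adjusting to your own site.

```python
import re
from collections import Counter

# Hypothetical sample lines (invented for this sketch).
SAMPLE_LINES = [
    "Mar  3 08:15:07 node002 sshd[4312]: Accepted publickey for alice from 10.0.0.5 port 50122 ssh2",
    "Mar  3 08:47:31 node001 sshd[4390]: Accepted password for bob from 10.0.0.7 port 50310 ssh2",
    "Mar  3 23:02:12 node002 sshd[5120]: Accepted publickey for alice from 10.0.0.5 port 50999 ssh2",
    "Mar  3 23:05:44 node003 sshd[977]: Failed password for root from 10.0.0.99 port 41200 ssh2",
]

# Captures the hour and the user of a successful SSH login.
LOGIN_RE = re.compile(
    r"^\w{3}\s+\d+\s(?P<hour>\d{2}):\d{2}:\d{2}\s\S+\s"
    r"sshd\[\d+\]:\sAccepted \S+ for (?P<user>\S+)"
)


def logins_per_hour(lines):
    """Count successful logins for each hour of the day (0-23)."""
    counts = Counter()
    for line in lines:
        match = LOGIN_RE.match(line)
        if match:
            counts[int(match.group("hour"))] += 1
    return counts


def off_hours(counts, usual=range(7, 20)):
    """Return hours with logins that fall outside the usual window.

    The default window (07:00-19:59) is an assumption, not a rule.
    """
    return sorted(hour for hour in counts if hour not in usual)


if __name__ == "__main__":
    counts = logins_per_hour(SAMPLE_LINES)
    for hour in sorted(counts):
        print(f"{hour:02d}:00  {counts[hour]} login(s)")
    print("Outside usual hours:", off_hours(counts))
```

A late-night login is not necessarily a problem, but knowing the usual pattern makes it easy to notice and ask about.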

Developing an intuition of how a system
