Logger in HPC System Administration

Store user-specified messages in syslog with logger.

HPC administrators periodically look through the system logs in /var/log, searching for errors or other interesting information. Although it is very nice that the logs are centralized and that admins have tools that use these logs to provide notifications, they are not easy to process and understand. Wouldn't it be nice to be able to put your own information into the system log (/var/log/syslog) so you can parse it for your messages with common tools such as grep? Of course, Linux has a tool for that: logger.

The system log is a central location for all system messages, which makes it a convenient place to store your own messages. Any automatic log processing tool will catch these messages, but you can also easily set up a simple grep to capture the messages you publish to syslog. This capability lets you make logging a routine part of your system administration, with logger entries written either manually or from scripts.

Introduction to Logger

All of the examples in this article are from a rather dated Ubuntu 20.04 system, although logger works the same way on current distributions. The basic syntax to write a simple message to the system log is pretty easy:

$ logger "Just a test"

The entry in syslog will include a time stamp indicating when the message was written to the system log, the name of the system, the user who wrote the message or the tag used to write the message, and, after a colon, the message itself:

...
Jun 28 12:24:55 laytonjb-Nitro-AN515-55 laytonjb: Just a test
...

This capability is really simple but very powerful. You can create the message as part of a script (e.g., a Bash or Python script), so you can construct a message that includes important details. The details could include the name of the script along with its purpose, the input to the script (if any), output from the script (e.g., the status or the location within the script, such as a line number), and the result of the script (whether or not it succeeded).
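As a minimal sketch of such a message, the helper below builds a log entry that names the script, the line number, and a status. The function name log_status and the key=value layout are my own assumptions for illustration, not anything logger requires.

```shell
# Hypothetical helper: log a status line that names the script, the
# location within it, and the result. All names here are illustrative.
SCRIPT_NAME=$(basename "$0")

log_status() {
    # $1 = line number (pass "$LINENO" at the call site)
    # $2 = status word, e.g., START, OK, FAILED
    # $3 = free-form details
    logger -t "$SCRIPT_NAME" "line=$1 status=$2 $3"
}

# Example calls inside a script:
#   log_status "$LINENO" START "input=$1"
#   log_status "$LINENO" OK "finished without errors"
```

Keeping the message in a fixed key=value layout makes it easier to pick apart later with grep or awk.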

If you use logger manually, you can write a message to the system log for anything you want. Perhaps you write a message just before you start writing code or checking the status of the system. Make the message something useful so you can identify its purpose in the log; more importantly, an entry in syslog with a time stamp tells you when you started. When you finish the project, you can use logger to add another entry saying you completed the project, including any interesting notes that would be useful when coming back to the project.

To find the elapsed time, use grep to look for the messages. For example, if you’re working on a project named Orion-Meatloaf, you just search for that project name:

$ sudo grep -i "project orion-meatloaf" /var/log/syslog

This command shows all entries in the current system log for that project. (I like to include project in the name to differentiate the message from administrative tasks, such as reading other logs (e.g., /var/log/auth.log or /var/log/boot.log) or performing other specific admin tasks, including debugging problems.)
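If you want the elapsed time as a number rather than reading two time stamps by eye, something like the following sketch could work. It assumes the traditional "Mon DD HH:MM:SS" time stamp shown earlier, GNU date, and that both entries fall in the same year; the function name project_elapsed is hypothetical.

```shell
# Sketch: seconds between the first and last log entries that mention
# a project. Assumes the traditional 15-character syslog time stamp
# and GNU date (-d); both entries must fall in the same year.
project_elapsed() {
    # $1 = search string, $2 = log file (defaults to /var/log/syslog)
    local first last start end
    first=$(grep -i -- "$1" "${2:-/var/log/syslog}" | head -n 1 | cut -c 1-15)
    last=$(grep -i -- "$1" "${2:-/var/log/syslog}" | tail -n 1 | cut -c 1-15)
    [ -n "$first" ] || return 1
    start=$(date -u -d "$first" +%s) || return 1
    end=$(date -u -d "$last" +%s) || return 1
    echo $((end - start))
}

# Example: project_elapsed "project orion-meatloaf"
```

Because reading /var/log/syslog usually needs elevated privileges, you would run this as a user who can read the log or point it at a copy of the file.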

Permissions

The user account on my laptop allows me to write to but not read from syslog. You might encounter a user account that is not allowed to write to syslog, in which case you will have to give that user permission to do so if they need it. However, who you allow to read syslog is up to you. You could give a user sudo permission to read syslog if you want, or you could sanitize syslog by removing all but the user’s messages before handing over the data to the user (this task also can be scripted).
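One way the sanitizing step might be scripted is sketched below. It assumes the traditional layout "Mon DD HH:MM:SS host tag: message", where the tag is the fifth whitespace-separated field, and that the user logged with the default tag (their user name). The function name user_messages is hypothetical.

```shell
# Sketch: keep only the lines whose tag field is exactly "<user>:".
# Entries logged with a PID suffix (e.g., "user[1234]:") or a custom
# tag are not matched by this simple comparison.
user_messages() {
    # $1 = user name, $2 = log file (defaults to /var/log/syslog)
    awk -v tag="$1:" '$5 == tag' "${2:-/var/log/syslog}"
}

# Example (run by an admin who can read the log):
#   user_messages laytonjb > /tmp/laytonjb-messages.txt
```

Matching on the tag field rather than anywhere in the line avoids handing over other entries that merely mention the user name.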

Tagging

If logger is used from the command line with the default options, you will get the user’s name with the message. Sometimes, however, the user might want to add a specific tag to the message:

$ logger -t myscript "A second test"

The entry in the system log would look like:

$ sudo tail -n 10 /var/log/syslog
...
Jun 28 14:28:55 laytonjb-Nitro-AN515-55 myscript: A second test
...

Notice that rather than using the user’s name, it uses the tag myscript.

This way of adding specific information to the messages in syslog is particularly helpful with scripts that are in cron jobs or other automated scripts.

Linux pipes can also be used to create complex commands, such as,

$ echo "Output redirect" | logger

to redirect the output from a command to logger.
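Combining the pipe with a tag gives a small reusable wrapper. The following is only a sketch; the function name log_output is hypothetical, and it sends both standard output and standard error to logger, which writes one log entry per line of input.

```shell
# Sketch: run a command and send its output (stdout and stderr) to the
# system log under a caller-chosen tag.
log_output() {
    # $1 = tag, remaining arguments = command to run
    local tag
    tag=$1
    shift
    "$@" 2>&1 | logger -t "$tag"
}

# Example: log_output disk-check df -h /home
```

Note that the return status of the pipeline is that of logger, not of the wrapped command, so a script that cares about the command's success needs to check it separately.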

Logging In

Whenever a user logs in to a system, the event is logged in /var/log/auth.log, but you could easily use logger in a script added to a user's account or perhaps create a script that wraps the login binary (/bin/login) and adds messages to syslog. You might be duplicating what /var/log/auth.log does, but you can add whatever extra information you want in the wrapper script and add messages to the system log.
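For the simpler of the two approaches, a script in the user's account, a sketch might look like the following. It is meant for a login shell startup file (e.g., ~/.bash_profile or a file in /etc/profile.d/); the function name log_login and the tag user-login are assumptions, and SSH_CONNECTION is the variable OpenSSH sets for remote sessions.

```shell
# Sketch for a login shell startup file. The tag "user-login" is an
# arbitrary choice. SSH_CONNECTION begins with the client address when
# the session arrives over SSH; otherwise "local" is recorded.
log_login() {
    logger -t user-login \
        "user=$(id -un) from=$(echo "${SSH_CONNECTION:-local}" | cut -d ' ' -f 1)"
}

# Call it once from the startup file:
#   log_login
```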

Environment Modules

In the HPC world, environment modules allow users to control their environment, such as changing compilers, libraries, and other tools, without having to change the user's $PATH, $LD_LIBRARY_PATH, and so on by hand. Many of the environment module implementations use scripts when the user changes modules or purges the loaded modules, which means it can all be captured and sent to the system log. The two most common environment module tools, modules and Lmod, capture much of this information, but with logger you can add to the system log whatever information you like, need, or want. Just be cognizant of the size of the system log if you have a large number of users that use environment modules.
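As a rough illustration of capturing module activity, the sketch below wraps the module command so that each invocation is logged before it runs. The wrapper name logged_module and the tag envmod are hypothetical, and this is not how either module tool does its own logging; it only assumes that module is available as a shell function or command in the user's session.

```shell
# Sketch: record each module command in the system log, then run it.
# "logged_module" and the tag "envmod" are hypothetical names.
logged_module() {
    logger -t envmod "user=$(id -un) cmd=module $*"
    module "$@"
}

# Example: logged_module load gcc
```

With many users, every load, unload, and purge becomes a log entry, which is exactly the log growth to watch for.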

Cron Jobs

Cron jobs are the perfect place to use logger. The cron service typically runs scripts that perform one or more tasks at specified time intervals. By default, standard output and standard error are both sent to the email of the user who owns the cron job. In the cron job script, you can add information to the standard output, such as the progress of the script, any issues (errors) that might have arisen, and any final output that ends up in an email. With logger, this data can be sent to the system logs, which, in my opinion, are easier to manipulate than email.

For example, I have a script that walks user home directories, gathering metadata information about their files. It doesn’t access the files, but it accesses the metadata with the stat command. I use this information for a variety of purposes:

  • to watch the increase in total space used by a particular user (uses previously gathered data),
  • to find the oldest file for a particular user (I'm looking for stale data that could be moved to archival storage),
  • to find the largest file for a particular user, and
  • to count the number of small files for that user (I define “small” in the script as less than 4kB).

The script gathers the data and writes it to a CSV file for postprocessing, so I can find the information I want. I could send the raw data to syslog, but that’s a very large amount of data sitting in that file. Instead, I can have the script just find the answers I want and then send a message with logger to syslog, such as the following examples:

User X has a file <fully qualified path to file> that is N days old.
The oldest file for User X is NN days.
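A message like the first one above could be produced by something along these lines. This is not my actual file-walking script, just a minimal sketch that assumes GNU find (for -printf) and file names without embedded newlines; the function name log_oldest_file is hypothetical.

```shell
# Sketch: find the oldest regular file under a directory and log its age.
log_oldest_file() {
    # $1 = user name, $2 = directory to walk
    local oldest mtime file days
    # Print "mtime path" per file, sort numerically, keep the oldest.
    oldest=$(find "$2" -type f -printf '%T@ %p\n' 2>/dev/null | sort -n | head -n 1)
    [ -n "$oldest" ] || return 0
    mtime=${oldest%% *}     # modification time, seconds since the epoch
    file=${oldest#* }       # everything after the first space
    days=$(( ( $(date +%s) - ${mtime%.*} ) / 86400 ))
    logger "User $1 has a file $file that is $days days old."
}

# Example: log_oldest_file laytonjb /home/laytonjb
```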

I could add a tag to the message (e.g., file_walking) so that when I parse the system log, I could search for messages with this tag and then pipe the results through another grep command to output only data from the day of the search. Granted, I could do this in the original script and output to a particular file or send by email, but again, having it in a central location and being able to correlate logger messages with other messages in the system logs is very useful (IMHO).
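The two-stage grep could be sketched as follows, assuming the messages were written with logger -t file_walking and that syslog uses the traditional "Mon DD" time stamp, in which single-digit days are padded with a space (the %e format of date). The function name tagged_today is hypothetical.

```shell
# Sketch: entries with a given tag that were logged today.
tagged_today() {
    # $1 = tag, $2 = log file (defaults to /var/log/syslog)
    grep -- " $1: " "${2:-/var/log/syslog}" |
        grep -- "^$(LC_ALL=C date '+%b %e') "
}

# Example: tagged_today file_walking
```

The LC_ALL=C setting keeps the month abbreviation in English regardless of the locale of the account running the search.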

System Information

Also in the cron job category, but worth mentioning separately, is a script in a cron job that gathers system information such as CPU, memory, network, and storage usage or any other system measurement you want. After gathering that data, you could use logger to send it to the system log, where you now have information as to the status of the system. Normally, this kind of data is not captured and stored anywhere, but you can easily add that capability.

Of course, I don’t think I would capture that information every second because the poor system log would explode. However, you can gather system information at a predefined interval to give you an idea of the system status. For example, maybe you could gather the system load every 15 minutes with uptime to get a very simplified view of the total system load. You could also grab information from the job scheduler (workload manager), such as how many jobs are running, how many jobs are waiting, how many nodes are free, and so on. Again, this is just a snapshot of the system status, but gathering data every 15 minutes means you are gathering 96 data points a day. Over a 30-day period, this is 2,880 data points, which can give you some good insight as to how the system is being used.
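A load snapshot of this kind might be gathered with the sketch below. It reads /proc/loadavg on Linux, which holds the same three load averages that uptime prints; the function name log_load, the tag sysload, and the script path in the crontab comment are all hypothetical.

```shell
# Sketch: log a one-line snapshot of the 1-, 5-, and 15-minute load
# averages.
log_load() {
    # $1 = file to read (defaults to /proc/loadavg)
    local one five fifteen rest
    read -r one five fifteen rest < "${1:-/proc/loadavg}"
    logger -t sysload "load1=$one load5=$five load15=$fifteen"
}

# Crontab entry to take a snapshot every 15 minutes
# (the script path is hypothetical):
#   */15 * * * * /usr/local/sbin/log_load.sh
```

Job scheduler counts could be appended to the same message in the same key=value style, so one grep for the tag returns the whole snapshot.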

Logs Get Huge Quickly

A consideration you need to address before committing to logger is how much data will be added to the system log. Periodically, the current /var/log/syslog is rotated out to /var/log/syslog.1 so that a new syslog can begin. All of the previous syslogs have their extensions incremented by 1, and the syslog with the largest count (e.g., syslog.7) is erased from the system, so /var/log won’t fill up. Older rotated syslogs are also compressed to save space. The use of logger means the logs will be bigger, perhaps not by much, but you need to keep that in mind.

If you need the older system logs, which you might, you need to make arrangements for them to be copied to another filesystem. Be sure this secondary filesystem has enough space for as many logs as you need to keep. System logs are text, so you can get a very high compression ratio on them.

Another option is just to parse syslog for whatever information you need (e.g., logger messages), resulting in a much smaller amount of data that can be stored either on the system or in secondary storage. Because the files are small, you can keep them on the primary system much longer before shipping them to secondary storage.

Interaction with Remote Logs

HPC systems will often use remote logging on the compute nodes, which means the logs from all of the compute nodes will be sent to a single server. I have not tested how this interacts with logger because my home lab clusters are so small, but just in case, logger has the ability to send the messages directly to a remote server.

You can take advantage of this feature and send logger messages to a central server but leave all other system messages on the local server. In a way, it would be like filtering syslog and sending the logger messages to a central syslog.
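With the util-linux version of logger, the remote server is selected with the -n (--server) and -P (--port) options. The sketch below wraps them in a function; the name remote_log, the LOG_SERVER and LOG_PORT variables, and the placeholder host loghost.example.com are assumptions, and the receiving server must already be configured to accept remote syslog messages.

```shell
# Sketch: send a tagged message straight to a remote syslog server
# instead of the local system log.
remote_log() {
    # $1 = tag, $2 = message
    logger -n "${LOG_SERVER:-loghost.example.com}" -P "${LOG_PORT:-514}" \
        -t "$1" "$2"
}

# Example: LOG_SERVER=10.1.1.1 remote_log file_walking "walk finished"
```

Because I have not tested this on a cluster with remote logging already in place, treat it as a starting point rather than a verified setup.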

Summary

I’m a big believer in logging lots of information, but if you don’t actively use the data, it’s pointless to have it in logs. The Linux tool logger allows you to create the messages you want and store them in the system log. You can define what is in the message, add tags, and specify when to send the message to syslog.

System administrators can do a great deal with logger to gather information, such as how time is spent working on projects. Scripts can gather just a little bit of information about how the user is functioning. I’m not a fan of logging too much user detail, and I will not gather any data that can be considered private, sensitive, or compromising to someone’s work.

The Linux logger tool is also very useful for sys admin tasks that have scripts. Messages to syslog that describe a script's progress, errors, and results are very simple to add. This additional data can greatly help in debugging issues and can track results as well.