It’s time to move beyond the basic commands for administering Linux HPC systems. Here are some very useful tools.

Really Useful Linux Commands for HPC Admins

While managing Linux desktops, laptops, and HPC systems, I learn new commands and tools. As a result, my admin patterns change. In this article I present some commands I have started using more often and have incorporated into my regular routine.

tee

Some scripts, whether for administrative functions or as part of a computational pipeline, write to output files. If I also want to watch that output as it is produced, I can run tail -f [file] in a different terminal window or use tmux, neither of which is ideal in my case. Lately, I’ve gone back to using tee in my scripts so the output goes to a file as well as stdout.

Classically, you can run scripts and redirect the output to a file:

$ code.script > file.out

All the stdout output is written to file.out, but I can quickly change this line to send the output to a file and to a terminal window:

$ code.script | tee file.out

With tee, I can stop using tail -f file.out in yet another window. Although tail is fine, tee shows me everything tail would, in the same terminal window where I started the script, while still writing the complete output to the file.
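Two variations come up often in practice: capturing stderr along with stdout, and appending to an existing logfile instead of overwriting it. The sketch below is only an illustration; run_job is a hypothetical stand-in for code.script, and the logfile is a temporary file.

```shell
#!/bin/sh
# run_job is a hypothetical stand-in for code.script:
# it writes to both stdout and stderr.
run_job() {
    echo "step 1 done"
    echo "warning: low disk" >&2
    echo "step 2 done"
}

log=$(mktemp)

# 2>&1 folds stderr into the pipe so tee records it as well.
run_job 2>&1 | tee "$log"

# -a appends to the file instead of truncating it, which is handy
# for repeated runs. Terminal output is discarded here for brevity.
run_job 2>&1 | tee -a "$log" > /dev/null

# Count the lines collected from both runs.
wc -l < "$log"
```

Without the 2>&1, error messages still appear in the terminal but never reach the file, which is easy to miss until you need them.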

watch

Have you run a command to get information about the system, but you need to run it again because the system changes over time? For example, you run uptime for a quick glance of the system load, but you want to track it over time, so you have to keep entering the command or using the Up arrow to rerun the command. A better way is to use watch, which continually runs the command for you.

The watch tool allows you to run user-defined commands at regular intervals and displays the command output to the terminal window. The cool thing is that it clears the window each time it prints the command output, which makes it easy to spot changing values.

The one-line display of critical system information from uptime – current time, up time (how long the system has been running), number of users logged in, and average system load for the past 1, 5, and 15 minutes – only gives you a snapshot of the system when you run the command. To use the command with watch, enter:

$ watch uptime

The display is updated with the most recent output at a default interval of two seconds, as listed at the top of the output.
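The interval can be changed with -n, and -d highlights the values that differ between updates. Because watch runs until you press Ctrl+C, it does not fit well inside a script; the bounded loop below is a rough stand-in for that case, not a reimplementation of watch. The helper name poll_cmd is made up for this sketch, and it reads /proc/loadavg, the file that holds the 1-, 5-, and 15-minute load averages.

```shell
#!/bin/sh
# Interactive use (not run here because watch loops until Ctrl+C):
#   watch -n 5 -d uptime    # refresh every 5 seconds, highlight changes

# poll_cmd is a hypothetical helper: run a command COUNT times,
# INTERVAL seconds apart, and prefix each sample with the time.
poll_cmd() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s  ' "$(date +%T)"
        "$@"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Three load samples, one second apart.
samples=$(poll_cmd 3 1 cat /proc/loadavg)
printf '%s\n' "$samples"
```

Unlike watch, the loop keeps every sample on screen, so the output can be redirected to a file for later review.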

script

If you have ever wanted to record the output of various commands and scripts to a file, kind of like keeping a record of what’s going on in the terminal window, the Linux command script is your tool. The man page for script says, “… it makes a typescript of everything displayed on your terminal ….”

The simple example

$ script file.out

displays everything in the terminal where you ran the command and also writes it to file.out. It continues recording until you end the session (Ctrl+D or exit) or the system reboots. Now you can do anything you want with file.out, which is mostly ASCII. Although you can edit the output file, watch for control characters, because the file is a “typescript” of the terminal and not pure ASCII. You can always remove those characters if they interfere with reading or editing the file.

I use script to create a record of what I did and what the output was when I'm installing something new or building and installing a new package. For example, I use it a great deal when I'm installing new Python packages. I know all about virtual environments in Python and I use them, but I’ve found having a file with a history of what I installed, commands and any tests I ran, and (more importantly) their output, allows me to backtrack easily in case I run into a problem.
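Removing the control characters from a typescript can be scripted. The following is a minimal sketch that assumes the clutter consists of ANSI escape sequences (colors, cursor movement) and carriage returns; a real typescript can contain other sequences that this pattern does not catch. The sample input is synthetic so the cleanup step can be tried anywhere.

```shell
#!/bin/sh
# A real typescript would come from something like:
#   script -q -c "some command" install.log
# (-q suppresses the start/done messages, -c runs a single command).
# A small synthetic sample stands in for it here.
raw=$(mktemp)
clean=$(mktemp)
printf '\033[1;32m$ \033[0mpip list\r\nsample output line\r\n' > "$raw"

# Build a literal ESC character in a portable way.
esc=$(printf '\033')

# Drop ANSI CSI sequences, then strip carriage returns.
sed "s/${esc}\[[0-9;?]*[A-Za-z]//g" "$raw" | tr -d '\r' > "$clean"

cat "$clean"
```

I would keep the raw file as well, because the cleaned copy is easier to read and search but is no longer an exact record of the session.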

touch

The Linux touch tool can be useful but can also be abused. The command changes file timestamps for the files specified on the command line:

$ touch example.out

If the file exists, touch updates the access and modification times of the file. It cannot set the time of the last metadata change (ctime) to a value of your choosing, though; the kernel sets ctime to the current time whenever the timestamps are modified. If the file does not exist, touch creates an empty file with default options, which you can override.

You can use touch for a variety of purposes. I use it to create an empty file so I can check permissions on a directory. I also use it to reset the modification and access times on a set of files if I want to “re-baseline” them. I occasionally do this when I want to archive a set of files at some fixed point in time.
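A short sketch of the re-baselining case follows, using a temporary directory and made-up file names. It relies on the -t option (set an explicit time), the -r option (copy the times from a reference file), and GNU-style stat -c formatting to print the modification and change times side by side so you can see which timestamps moved.

```shell
#!/bin/sh
dir=$(mktemp -d)
f="$dir/example.out"

touch "$f"                      # create an empty file
touch -t 202001011200 "$f"      # set atime and mtime to 2020-01-01 12:00 local time
touch -r "$f" "$dir/copy.out"   # create copy.out with the same atime and mtime

# %y is the modification time, %z the inode change time, %n the file name.
stat -c 'mtime: %y  ctime: %z  %n' "$f" "$dir/copy.out"
```

To re-baseline a whole set of files, the same -r form can be applied to each file with one reference file that carries the timestamp you want.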

Some users have discovered that by putting touch in a cron job, they can change the modification and access times of their files, which defeats admin tools that scan for files older than a specific date by making the files appear to have been recently used or modified.

For example, you might create a simple script, run from a cron job, that compresses all user data that has not been accessed in 10 days. The touch command can be used to prevent this script from compressing the data. I have had users who do this, and I know of others. The downside of using touch to make files look actively accessed or modified is that the timestamps no longer reflect when the files were really used. However, if keeping the tool from compressing files matters more than keeping accurate times on the files, then touch can be a very useful tool.
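The kind of cleanup script described above can be sketched with find and its -atime test. This is an assumption about how such a job might look, not a description of any particular site's tool; the directory and file names are invented, and the old access time is faked with touch -a so the example runs immediately.

```shell
#!/bin/sh
# Hypothetical stand-in for a user data directory.
data=$(mktemp -d)
echo "old results" > "$data/old.dat"
echo "new results" > "$data/new.dat"

# Pretend old.dat was last accessed at the start of 2020.
touch -a -t 202001010000 "$data/old.dat"

# Compress regular files, not already gzipped, whose last access
# is more than 10 days in the past.
find "$data" -type f ! -name '*.gz' -atime +10 -exec gzip {} +

ls "$data"
```

A user who runs touch on old.dat before the job fires resets its access time, so the -atime +10 test no longer matches and the file is skipped.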

pigz

When I started writing programs in my undergraduate days, storage space was at an absolute premium. To save space I would quickly erase all object files (*.o) once I created the final binary. If I needed the object file I just recreated it. At that time, CPU cycles were more plentiful compared with storage space. Moreover, once I finished my work, I would erase the binary and the object files.

I then learned to use a tool that could compress the files to save even more space. I could compress my program files a great deal, save my binaries, and save some time when I logged back in to the system. The compression tool absolutely became a habit.

I still have this habit of compressing files when I can, particularly if the files look large or I have finished a project, even though I might have terabytes of storage on my laptop, desktops, and servers. Compressing all files in my user account is a difficult habit to break.

The most common compression tool in Linux is gzip, with the associated uncompressing tool gunzip. The pair are in every Linux distribution and are very easy to use. Compressing a file is as simple as entering:

$ gzip -9 file.py

The command compresses file.py and adds a .gz to the end of the file name so you know it’s compressed.

I like to use the -9 option, which tells gzip to use maximum compression. Even though -9 does not always give you the smallest file, I use it anyway. To uncompress the file, you use gunzip,

$ gunzip file.py.gz

which changes the file back to its original name by removing the .gz from the end.

You can use gzip with wildcards (*) to compress everything or a set of files that match a pattern. The same is true for gunzip.
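To see what a compression level buys you on your own data, you can compare sizes without touching the original file. The sketch below uses synthetic data from seq, so the numbers it prints say nothing about real files. The -c option writes to stdout, -k keeps the original file alongside the compressed copy, and -l reports the compression ratio.

```shell
#!/bin/sh
sample=$(mktemp)
seq 1 200000 > "$sample"     # synthetic, highly compressible test data

# Compare sizes for the fastest and the strongest setting.
orig=$(wc -c < "$sample")
fast=$(gzip -1 -c "$sample" | wc -c)
best=$(gzip -9 -c "$sample" | wc -c)
echo "original: $orig bytes, -1: $fast bytes, -9: $best bytes"

# Keep the original next to the compressed copy, then show the ratio.
gzip -9 -k "$sample"
gzip -l "$sample.gz"
```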

With large files or when compressing many files in a directory, gzip takes a long time because it runs on a single core. Rather than wait for the compression to complete, I use pigz, which drastically reduces the time needed to compress a file or a set of files. The “parallel gzip” command is probably available for the Linux distribution you’re using, although you might have to install it.

The pigz command is very much like gzip, so virtually all the gzip options should work. For example, I still use the -9 option with pigz for maximum compression:

$ pigz -9 file.out

By default, pigz uses all the cores on the system. To limit the number of cores to four, use the -p option:

$ pigz -9 -p4 file.out

To uncompress, add the -d option. You can also uncompress the file with gunzip.

I have taught myself to use pigz instead of gzip, but you can easily create an alias in your .bashrc file to point to pigz when you type gzip.
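Aliases are not expanded in non-interactive shell scripts by default, so in scripts I would rather pick the compressor explicitly. The following sketch uses pigz when it is installed and falls back to gzip otherwise; the function name compress_stream, the four-thread limit, and the file names are arbitrary choices for illustration.

```shell
#!/bin/sh
# compress_stream is a hypothetical helper: compress stdin to stdout
# with pigz if available, otherwise with gzip.
compress_stream() {
    if command -v pigz >/dev/null 2>&1; then
        pigz -9 -p 4 -c
    else
        gzip -9 -c
    fi
}

work=$(mktemp -d)
mkdir "$work/project"
seq 1 50000 > "$work/project/results.out"   # synthetic test data

# Stream a tar archive through the compressor.
tar -cf - -C "$work" project | compress_stream > "$work/project.tar.gz"

# pigz writes ordinary gzip format, so gunzip can read the result.
gunzip -c "$work/project.tar.gz" | tar -tf -
```

Because the output is standard gzip format either way, the archive can be unpacked on systems that do not have pigz installed.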

whereis

The whereis command is not commonly used outside of HPC, but I use it all the time on HPC servers. It finds the binary, the source code, and the man pages for a specific command. I primarily use it to tell me the path to a binary.

This command comes in handy when using environment modules. Although I can check what modules are loaded with module list, sometimes I want to make 100% sure of the binary I’m using, especially the full path. For example, I use whereis to check the path of the GCC compiler I’m using. This command is really useful when you are writing new modules.

However, whereis can be used for other purposes, as well. If you look at the command’s man page, it provides an example that finds all files in /usr/bin that are not documented in /usr/man/man1 and have no source in /usr/src.

which

A companion command to whereis that I find useful is which. This command returns the full path of a command as it resolves in the current environment, which makes it particularly useful for working with environment modules. I reach for whereis and which interchangeably for my needs, but they report different things: which prints the first match on your PATH (the binary that actually runs), whereas whereis lists the binary, source, and man page locations it finds.
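A quick way to compare the two is to run them against the same command after loading a module. The sketch below looks up sh by default so it runs on any system; on an HPC node you would more likely pass gcc or an MPI wrapper. It also uses command -v, the shell's built-in counterpart to which, and it skips whereis and which if they are not installed.

```shell
#!/bin/sh
# Command to look up; defaults to sh so the example runs anywhere.
cmd=${1:-sh}

# First match on the current PATH: this is what actually runs.
resolved=$(command -v "$cmd")
echo "resolved: $resolved"

# whereis -b lists binaries only, searching standard locations
# as well as PATH, so it can report more than one candidate.
if command -v whereis >/dev/null 2>&1; then
    whereis -b "$cmd"
fi

# which -a prints every match on PATH, in search order.
if command -v which >/dev/null 2>&1; then
    which -a "$cmd"
fi
```

If the first line of the which -a output is not the path you expected, the order of the directories in PATH (and therefore the order in which modules were loaded) is the first thing to check.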