Autonomous File Recovery

Let users recover deleted files without administrator intervention by aliasing the rm command to mv or by writing your own script that moves the data to another location instead of deleting it.
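As a rough illustration of the "write your own script" option, the following minimal Python sketch moves a file into a holding directory rather than removing it. The function name move_to_trash, the timestamp prefix, and the choice of holding directory are assumptions made for this example; they are not taken from the article.

```python
import shutil
import time
from pathlib import Path


def move_to_trash(path, trash_dir):
    """Move `path` into `trash_dir` instead of deleting it.

    Returns the new location so the user can find the file later.
    Both the function name and the naming scheme are hypothetical.
    """
    src = Path(path)
    trash = Path(trash_dir)
    trash.mkdir(parents=True, exist_ok=True)

    # Prefix a timestamp so deleting two files with the same name
    # does not overwrite the earlier copy.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = trash / f"{stamp}-{src.name}"

    # If the same name is removed twice within one second, add a counter.
    counter = 1
    while dest.exists():
        dest = trash / f"{stamp}-{counter}-{src.name}"
        counter += 1

    shutil.move(str(src), str(dest))
    return dest
```

A user recovers a file simply by moving it back out of the holding directory. Note that shutil.move copies and then deletes when the source and destination are on different filesystems, so keeping the holding directory on the same filesystem as the data keeps the operation cheap.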

Linux I/O Schedulers

The Linux kernel has several I/O schedulers that can greatly influence performance. We take a quick look at I/O scheduler concepts and the options that exist within Linux.
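To see which schedulers a given kernel offers, you can read the per-device scheduler file under /sys/block, where the active scheduler appears in square brackets (e.g., "noop deadline [cfq]"). The Python sketch below parses that format; the helper names are hypothetical, and the device name passed to read_scheduler (such as "sda") is an assumption that depends on your system.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Parse the contents of a queue/scheduler file.

    Returns (available, active), where `available` lists every scheduler
    name and `active` is the one the kernel marks with square brackets.
    """
    available = []
    active = None
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return available, active


def read_scheduler(device):
    """Read the scheduler list for a block device, e.g. "sda" (assumed name)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())
```

On a Linux host, read_scheduler("sda") would return the schedulers available for that device along with the active one, provided the device exists.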

What to Do with System Data: Think Like a Vegan

What do you do with all of the HPC data you harvested as a lumberjack? You think like a vegan.

Log Everything

To be a good HPC system administrator for today’s environment, you need to be a lumberjack.

HPC Compilers

If you compile software on an expensive supercomputer, it’s a good idea to select the languages and compilers with particular care. We report on tried-and-tested tools used on SuperMUC, a supercomputer at the Leibniz Supercomputing Center in Germany.

More Small Tools

We look at some additional tools that you might find useful when troubleshooting HPC systems.

Freeing the GPU

Exploring AMD’s ambitious Radeon Open Compute Ecosystem with ROCm senior director Greg Stoner.

Exploring AMD’s Ambitious ROCm Initiative

AMD’s ROCm platform brings new freedom and portability to the GPU space.

It’s the Little Things

Several very sophisticated tools can be used to manage HPC systems, but it’s the little things that make them hum. Here are a few favorites.

Resource Monitoring for Remote Applications

Remora combines profiling and system monitoring to help you get to the root of application problems by revealing how an application uses resources.