Slurm Job Scheduling System

Common Slurm Commands

Slurm comes with a range of commands for administering, using, and monitoring a Slurm configuration. A number of tutorials detail their use, but to be complete, I will look at a few of the most commonly used commands.


The all-purpose command sinfo lets users discover how Slurm is configured:

$ sinfo -s
PARTITION AVAIL  TIMELIMIT   NODES(A/I/O/T)  NODELIST
p100         up   infinite         4/9/3/16  node[212-213,215-218,220-229]

This example lists the availability, time limit, node counts (allocated/idle/other/total), and node list of the p100 partition.
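The NODES(A/I/O/T) field packs four counts into a single token, so scripts that read sinfo output usually split it apart. A minimal sketch with awk, using the sample line above as stand-in input so it runs without a cluster:

```shell
# Extract the allocated/idle/other/total node counts from a `sinfo -s` line.
# The sample line from above stands in for live output.
line="p100     up   infinite         4/9/3/16  node[212-213,215-218,220-229]"

# Field 4 is NODES(A/I/O/T); split it on "/" into its four counts.
echo "$line" | awk '{split($4, n, "/");
    printf "allocated=%s idle=%s other=%s total=%s\n", n[1], n[2], n[3], n[4]}'
# prints: allocated=4 idle=9 other=3 total=16
```

On a live system, the same awk program can be fed directly with `sinfo -s -h` (the -h option drops the header line).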


To submit a batch serial job to Slurm, pass a job script to the sbatch command (runscript.sh here is just a placeholder name):

$ sbatch runscript.sh

For batch jobs, sbatch is one of the most important commands, made powerful by its large number of options.
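A typical job script pairs #SBATCH directives, which Slurm reads as sbatch options, with the commands to run. A minimal sketch (the partition name p100 and the resource values are assumptions; adjust them for your cluster):

```shell
#!/bin/bash
#SBATCH --job-name=serial_test     # name shown by squeue
#SBATCH --partition=p100           # assumed partition; match your site
#SBATCH --ntasks=1                 # a serial job needs one task
#SBATCH --time=00:10:00            # wall-clock limit, HH:MM:SS
#SBATCH --mem=1000                 # memory per node in megabytes
#SBATCH --output=job_%j.out        # output file; %j expands to the job ID

# The commands below run on the allocated node once the job starts.
echo "Job running on $(hostname)"
```

Because #SBATCH lines are ordinary shell comments, the same file also runs as a plain script outside of Slurm.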


To run parallel jobs, use srun:

$ srun -p test -t 10 --mem 1000 [script or app]

The similar command

$ srun --pty -p test -t 10 --mem 1000 /bin/bash

launches an interactive Bash shell on a compute node; the --pty option attaches a pseudoterminal so the session behaves like a normal login shell.


The scancel command allows you to cancel a specific job; for example,

$ scancel 999999

cancels job 999999. You can find the ID of your job with the squeue command.


To print a list of jobs in the job queue or for a particular user, use squeue. For example,

$ squeue -u akitzmiller

lists only that user's jobs.
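squeue and scancel also combine naturally: squeue can emit bare job IDs that xargs feeds to scancel, which is handy for clearing out all of your jobs at once. A sketch, with printf standing in for squeue and echo guarding scancel so the pipeline can be run and inspected without a Slurm installation:

```shell
# On a real cluster, cancel all of one user's jobs with:
#   squeue -u "$USER" -h -o %i | xargs scancel
# (-h drops the header line; -o %i prints only the job IDs.)
# Below, printf supplies fake job IDs in place of squeue, and echo is
# prepended to scancel so nothing is actually cancelled.
printf '100001\n100002\n' | xargs echo scancel
# prints: scancel 100001 100002
```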


The sacct command displays the accounting data for all jobs and job steps in the Slurm job accounting log or Slurm database. You can also run it against a specific job number:

$ sacct -j 999999


A resource manager is one of the most critical pieces of software in HPC. It allows systems and their resources to be shared efficiently, and it is remarkably flexible, allowing the creation of multiple queues according to resource types or generic resources (e.g., GPUs in this article). Slurm also supports job accounting.

The Slurm resource manager is one of the most common job schedulers in use today for very good reasons, some of which I covered here. Prepare to be “Slurmed.”