Basic Slurm Commands

These are some of the basic commands for viewing information from the Slurm Workload Manager:

  • sinfo – Quick view of hardware allocated and free
  • sbatch – Submit a job file
  • squeue – View running and pending jobs
  • sshare – View fairshare information
  • sprio – View queued job’s priority
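
For example, a typical session on a login node might look like the following (the script name my_job.sbatch is a placeholder):

    # Check which nodes are free or allocated
    sinfo

    # Submit a batch script; Slurm prints the assigned job ID
    sbatch my_job.sbatch

    # List your own jobs in the queue
    squeue -u $USER

    # Check fairshare and the priority of pending jobs
    sshare
    sprio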

If you are a PBS/Torque user and want to migrate to Slurm, the following table lists equivalent PBS and Slurm submission commands and options; a comparison of job script headers follows the table.

Command          | PBS/Torque                                            | Slurm
Job submission   | qsub job_script                                       | sbatch job_script
Job submission   | qsub -q queue -l nodes=1:ppn=16 -l mem=64g job_script | sbatch --partition=queue --nodes=1 --ntasks-per-node=16 --mem=64g job_script
Node count       | -l nodes=count                                        | --nodes=count
Cores per node   | -l ppn=count                                          | --ntasks-per-node=count
Memory size      | -l mem=16384                                          | --mem=16g
Wall clock limit | -l walltime=hh:mm:ss                                  | --time=days-hh:mm:ss
Job name         | -N name                                               | --job-name=name
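
For example, the two script headers below combine several rows of the table above into equivalent PBS and Slurm job scripts (the queue/partition name, resource values, and job name are illustrative).

PBS/Torque:

    #!/usr/bin/env bash
    #PBS -q queue
    #PBS -l nodes=1:ppn=16
    #PBS -l mem=64g
    #PBS -N myjob

Slurm:

    #!/usr/bin/env bash
    #SBATCH --partition=queue
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=16
    #SBATCH --mem=64g
    #SBATCH --job-name=myjob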

The sbatch arguments below are the minimal subset required to accurately specify a job on the H2P cluster. Please refer to the output of the man sbatch command or SchedMD's Slurm documentation for more options.

SBATCH ARGUMENT    | DESCRIPTION
--nodes            | Number of nodes to be allocated to the job.
--ntasks-per-node  | Number of tasks to be launched on each node.
--cpus-per-task    | Advise the Slurm controller that ensuing job steps will require this many processors per task.
--error            | File to which standard error is redirected.
--job-name         | The job name.
--time             | Total run time limit for the job, in the format days-hh:mm:ss.
--cluster          | Cluster to submit the job to. smp, mpi, and gpu are the available clusters on H2P.
--partition        | Partition to submit the job to: smp and high-mem on the smp cluster; opa and legacy on the mpi cluster; gtx1080, titan, titanx, and k40 on the gpu cluster.
--account          | Charge resources used by this job to the specified account. This is only relevant for users who belong to multiple Slurm accounts through collaborating groups.
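
These arguments can be given directly on the command line; for example, the submission below (script name and values are illustrative) requests four cores on one node of the smp cluster for one hour:

    sbatch --cluster=smp --partition=smp --nodes=1 --ntasks-per-node=1 \
           --cpus-per-task=4 --time=0-01:00:00 --job-name=example my_job.sbatch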

srun also takes the --nodes, --ntasks-per-node, and --cpus-per-task arguments, allowing each job step to change the resources it uses, but these cannot exceed the resources given to sbatch. The above arguments can be provided in a batch script by preceding them with #SBATCH.

Note

The shebang (#!) line must be present. It specifies the interpreter for the script and can call any shell or scripting language available on the cluster, for example #!/usr/bin/env bash.
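
Putting these pieces together, a minimal batch script might look like the following (the partition, program name, and resource values are illustrative; adjust them for your own work):

    #!/usr/bin/env bash
    #SBATCH --job-name=example
    #SBATCH --cluster=smp
    #SBATCH --partition=smp
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --cpus-per-task=4
    #SBATCH --time=0-01:00:00
    #SBATCH --error=example.err

    # Launch a job step inside the allocation; its resource request
    # may not exceed what was granted to sbatch above.
    srun --ntasks=1 --cpus-per-task=4 ./my_program

Submit the script with sbatch, e.g. sbatch example.sbatch.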

Slurm is very explicit in how one requests cores and nodes. While extremely powerful, the three flags --nodes, --ntasks, and --cpus-per-task can be a bit confusing at first.

--ntasks vs. --cpus-per-task

The term “task” in this context can be thought of as a “process”. Therefore, a multi-process program (e.g. MPI) comprises multiple tasks, which are requested in Slurm with the --ntasks flag. A multi-threaded program comprises a single task, which can in turn use multiple CPUs; CPUs for multi-threaded programs are requested with the --cpus-per-task flag. Individual tasks cannot be split across multiple compute nodes, so requesting a number of CPUs with the --cpus-per-task flag will always result in all of those CPUs being allocated on the same compute node.
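
For example, the two request patterns below (core counts are illustrative) ask for 16 cores in different ways: the first suits an MPI program and allows the 16 tasks to be spread across nodes, while the second suits a multi-threaded program and places all 16 CPUs on a single node:

    # MPI-style: 16 tasks (processes), one CPU each; tasks may span nodes
    #SBATCH --ntasks=16
    #SBATCH --cpus-per-task=1

    # Multi-threaded (e.g. OpenMP): one task with 16 CPUs, all on one node
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=16

Use one pattern or the other in a given script, not both.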