Complete HPC Guide
See a comprehensive guide on SLURM and the Princeton HPC systems: Getting Started with the HPC Clusters
On all of the cluster systems, you run programs by storing the necessary commands in a script file and requesting that the job scheduling program SLURM execute the script file.
A SLURM script file begins with a line identifying the Unix shell to be used by the script, usually #!/bin/bash. Next come directives to SLURM, each beginning with #SBATCH. Every SLURM script should include the --nodes, --ntasks-per-node, and --time directives. The --nodes directive tells SLURM how many nodes to assign to the job, --ntasks-per-node tells SLURM how many simultaneous processes will run on each node, and --time sets the maximum time the job may run.
In the example below, the job asks for one node, one task, and one minute of running time.
The SLURM directives are followed by the Unix commands needed to run your program. If your program is named my_app and it is stored in your home directory, the command would be ./my_app
```shell
#!/bin/bash
#SBATCH --job-name=slurm-test    # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks across all nodes
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core
#SBATCH --time=00:01:00          # total run time limit (HH:MM:SS)

./my_app
```
If the SLURM script file is named my_job.slurm, then you would submit it with the command `sbatch my_job.slurm`.
Download a command summary: PDF
More SLURM Examples
See more example Slurm scripts here.
Getting Notifications from a Job
You can request that SLURM send you e-mail when a job begins and ends using the --mail-type and --mail-user directives. Just add the following lines to your job script, with "YourNetID" replaced by your own NetID.
```shell
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-type=fail         # send email if job fails
#SBATCH --mail-user=<YourNetID>@princeton.edu
```
Serial and Parallel Jobs
Serial jobs only use a single processor. The previous example shows a typical SLURM serial job. It runs one task using one node and one task per node. For information about running multiple serial tasks in a single job, see Running Serial Jobs.
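One common way to run multiple serial tasks in a single submission is a SLURM job array. The sketch below is a hedged example of that approach: the program name my_serial_app and the input file naming scheme are hypothetical, and the details on your cluster may differ (see Running Serial Jobs).

```shell
#!/bin/bash
#SBATCH --job-name=serial-array   # create a short name for your job
#SBATCH --nodes=1                 # each array element gets one node
#SBATCH --ntasks=1                # one serial task per array element
#SBATCH --time=00:10:00           # run time limit per array element (HH:MM:SS)
#SBATCH --array=0-4               # launch five independent copies, indices 0..4

# SLURM sets SLURM_ARRAY_TASK_ID to this element's index (0, 1, ..., 4),
# so each copy can pick its own input file
./my_serial_app input_${SLURM_ARRAY_TASK_ID}.dat
```

Each array element is scheduled as its own serial job, so the five copies may start at different times and run on different nodes.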
Parallel jobs use more than one processor at the same time. Two common types of parallel jobs are MPI and OpenMP. MPI jobs run many copies of the same program across many nodes and use the Message Passing Interface (MPI) to coordinate among the copies. More information about running MPI jobs is in Compiling and Running MPI Jobs.
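As a rough sketch, an MPI job script mainly differs from a serial one in requesting multiple tasks and launching the program with srun. The executable name my_mpi_app is hypothetical, and the module names are borrowed from the advanced example later in this guide; check `module avail` on your cluster.

```shell
#!/bin/bash
#SBATCH --job-name=mpi-test       # create a short name for your job
#SBATCH --nodes=2                 # node count
#SBATCH --ntasks-per-node=4       # MPI ranks per node (8 ranks in total)
#SBATCH --time=00:30:00           # total run time limit (HH:MM:SS)

module purge
module load intel-mpi intel       # load the MPI runtime and compiler modules

srun ./my_mpi_app                 # srun starts one copy of the program per task
```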
OpenMP parallelizes the loops within a program. OpenMP programs run as multiple “threads” on a single node with each thread using one core. Information about how to run OpenMP in SLURM is in Running OpenMP jobs.
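A minimal sketch of an OpenMP job script, assuming a multi-threaded executable my_openmp_app (a hypothetical name): the key points are a single task with multiple cores, and setting OMP_NUM_THREADS to match the allocation.

```shell
#!/bin/bash
#SBATCH --job-name=omp-test       # create a short name for your job
#SBATCH --nodes=1                 # OpenMP threads must share a single node
#SBATCH --ntasks=1                # a single process...
#SBATCH --cpus-per-task=8         # ...with one core per thread
#SBATCH --time=00:30:00           # total run time limit (HH:MM:SS)

# tell the OpenMP runtime to use exactly the cores SLURM allocated
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_app
```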
Matlab loops that use the PARFOR statement will operate in a parallel fashion much like OpenMP. See Running Parallel Matlab Jobs for more information.
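A hedged sketch of a Matlab parfor job: like OpenMP, the workers share one node, with one core per worker. The script name my_parfor_script.m is hypothetical, and the matlab module name is an assumption; verify it with `module avail` and see Running Parallel Matlab Jobs for the recommended setup.

```shell
#!/bin/bash
#SBATCH --job-name=parfor-test    # create a short name for your job
#SBATCH --nodes=1                 # parfor workers run on a single node
#SBATCH --ntasks=1                # one MATLAB process
#SBATCH --cpus-per-task=4         # one core per parfor worker
#SBATCH --time=00:30:00           # total run time limit (HH:MM:SS)

module purge
module load matlab                # module name is an assumption; check 'module avail'

# run a script containing a parfor loop, without the desktop GUI
matlab -batch "my_parfor_script"
```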
GPU Jobs
GPU nodes are available on Tiger, Traverse and Adroit. To use GPUs in a job, you will need an #SBATCH directive using the --gres option to request that the job be run on a GPU node and to specify the number of GPUs to allocate. There are four GPUs on each GPU-enabled node.
```shell
#!/bin/bash
#SBATCH --job-name=poisson       # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=4               # total number of tasks across all nodes
#SBATCH --cpus-per-task=7        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G per cpu-core is default)
#SBATCH --gres=gpu:4             # number of gpus per node
#SBATCH --time=01:00:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=all          # send email on job start, end and fail
#SBATCH --mail-user=YourNetID@princeton.edu

module purge
module load anaconda3
conda activate myenv

srun python myscript.py
```
Note that your code will only be able to utilize a GPU if it has been explicitly written to do so. Likewise, it will only be able to utilize multiple GPUs if it has been written to do so.
Useful SLURM Commands
| Command | Description |
|:--|:--|
| `sbatch <slurm_script>` | Submit a job (e.g., `sbatch calc.cmd`) |
| `squeue` | Show jobs in the queue |
| `squeue -u <username>` | Show jobs in the queue for a specific user (e.g., `squeue -u ceisgruber`) |
| `squeue --start` | Report the expected start time for pending jobs |
| `squeue -j <jobid>` | Show the nodes allocated to a running job |
| `scancel <jobid>` | Cancel a job (e.g., `scancel 2534640`) |
| `snodes` | Show properties of the nodes on a cluster (e.g., maximum memory) |
| `sinfo` | Show how nodes are being used |
| `sshare` / `sprio` | Show the priority assigned to jobs |
| `smap` / `sview` | Graphical display of the queues |
| `slurmtop` | Text-based view of cluster nodes |
| `scontrol show config` | View default parameter settings |
An Advanced SLURM Script
There are many ways to configure a SLURM job. Here is an advanced script:
```shell
#!/bin/bash
#SBATCH --job-name=poisson       # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=4               # total number of tasks across all nodes
#SBATCH --cpus-per-task=7        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G per cpu-core is default)
#SBATCH --gres=gpu:4             # number of gpus per node
#SBATCH --time=01:00:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=all          # send email on job start, end and fail
#SBATCH --mail-user=YourNetID@princeton.edu

pwd; hostname; date
env | grep SLURM | sort

ulimit -s unlimited
ulimit -c unlimited
taskset -p $$

module purge
module load intel-mpi intel
module list

srun ./a.out

date
```
The `ulimit -s unlimited` line makes the stack size the maximum possible, which is important if your code dynamically allocates a large amount of memory. Purging the modules ensures that nothing has been unintentionally loaded. The `module list` statement is useful because it writes out the explicit module versions, which matters if you later need to know exactly which modules you used. Lastly, all of the SLURM environment variables are printed, so you can examine their values to see whether they are as expected.
The default SLURM settings for a cluster are found in /etc/slurm/slurm.conf
To see the run time limits for a cluster, look at: /etc/slurm/job_submit.lua