Programs are scheduled to run on Tiger using the sbatch command, a component of Slurm. Your job will be put into the appropriate quality of service, based on the requirements that you describe. See the information about Job Scheduling on the main Tiger page and the sbatch man page for details.
To set up your environment correctly on Tiger, it is highly recommended to use the module facility. This utility sets your environment correctly without requiring you to know the paths to the executables and libraries. In most cases, a simple module load intel-mpi command is enough to set up your environment to use the latest Intel MPI library.
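As a quick orientation, the following commands show common ways to inspect and manage modules. The module names shown are examples; the versions available on Tiger may differ.

```shell
# List the versions of a module available on the system
module avail intel-mpi

# Load a module, then confirm what is currently loaded
module load intel-mpi
module list

# Remove all loaded modules to start from a clean environment
module purge
```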
Compiling parallel MPI programs
# loads the intel compilers and MPI library
module load intel intel-mpi
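With the intel and intel-mpi modules loaded, the Intel MPI compiler wrappers can be used to build your program. The source filenames below are illustrative:

```shell
# Compile a C MPI program with the Intel MPI wrapper for the Intel C compiler
mpiicc -O2 -o hello_mpi hello_mpi.c

# Fortran and C++ equivalents
mpiifort -O2 -o hello_mpi hello_mpi.f90
mpiicpc -O2 -o hello_mpi hello_mpi.cpp
```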
Submitting a Job
Once the executable is compiled, a job script will need to be created for the scheduler. Each node on this machine has a total of 40 processor cores. Here is a sample script which uses 80 processors, allocated as 40 processors per node on 2 nodes:
#!/bin/bash
#SBATCH --job-name=slurm-test    # create a short name for your job
#SBATCH --nodes=2                # node count
#SBATCH --ntasks-per-node=40     # number of tasks per node
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G per cpu-core is default)
#SBATCH --time=01:00:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=all          # send email on job start, end and fail
#SBATCH --mail-user=firstname.lastname@example.org

module purge
module load intel intel-mpi

srun <yourmpiprogram>
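Assuming the script above is saved in a file such as job.slurm (the filename is illustrative), it can be submitted and monitored with the standard Slurm commands:

```shell
# Submit the job script; sbatch prints the assigned job ID
sbatch job.slurm

# Check the status of your queued and running jobs
squeue -u $USER

# Cancel a job by its ID if needed
scancel <jobid>
```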
To allocate a job using the GPUs (up to 4 per node), you will need to add another specifier to the #SBATCH directives requesting the number of GPUs, as well as sending the job to the GPU partition. Because of the way the GPUs are laid out on a node, you must allocate both sockets if you require more than 2 GPUs. These nodes have 28 CPU cores each. The following script was written with a Python application in mind:
#!/bin/bash
#SBATCH --job-name=poisson       # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=4               # total number of tasks across all nodes
#SBATCH --cpus-per-task=7        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G per cpu-core is default)
#SBATCH --gres=gpu:4             # number of gpus per node
#SBATCH --time=01:00:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=all          # send email on job start, end and fail
#SBATCH --mail-user=YourNetID@princeton.edu

module purge
module load anaconda3
conda activate myenv

srun python myscript.py
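To verify inside a running job that the requested GPUs were actually allocated, you can add a couple of diagnostic commands to the script body (these are generic checks, not specific to the sample application above):

```shell
# Slurm sets CUDA_VISIBLE_DEVICES to the GPUs allocated via --gres
echo "Allocated GPUs: $CUDA_VISIBLE_DEVICES"

# nvidia-smi lists the visible GPUs and their current utilization
nvidia-smi
```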
Note that your code will only be able to utilize a GPU if it has been explicitly written to do so. Furthermore, it will only be able to utilize multiple GPUs if it has been written to do so.
For a set of useful Slurm commands, see this page.