How do I use local scratch?

Local scratch (i.e., /tmp) refers to the local disk physically attached to each compute node on a cluster. This is the fastest storage available to a job while it's running.

However, data stored in /tmp on one compute node cannot be read directly by another compute node. In addition, your Slurm script must copy any output data in /tmp to another location before the job ends. A common approach is to create a directory in /tmp and pass the name of this directory to the application.

When the application has completed, the Slurm script should copy the files you need from /tmp on each node that the job used. This must be done before the job terminates.
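
For a job that spans several nodes, the copy step has to run once on every node. The following is a minimal sketch rather than a site-specific recipe: the function name copy_node_output and the per-node subdirectory layout are assumptions made for illustration. SLURMD_NODENAME is used where Slurm provides it, with hostname as a fallback.

```shell
#!/bin/bash
# Hypothetical helper: copy this node's scratch directory into a
# per-node subdirectory of the destination so that nodes do not
# overwrite each other's files.
copy_node_output() {
    local src="$1" dest="$2"
    # SLURMD_NODENAME is set by Slurm on the compute node; fall back to hostname.
    local node="${SLURMD_NODENAME:-$(hostname)}"
    mkdir -p "$dest/$node" || return 1
    # The trailing "/." copies the contents of src, including hidden files.
    cp -r "$src"/. "$dest/$node"/
}

# Possible use in a job script, once per node (assumes bash on the compute
# nodes and a destination that every node can write to):
#   export -f copy_node_output
#   srun --ntasks="$SLURM_JOB_NUM_NODES" --ntasks-per-node=1 \
#        bash -c 'copy_node_output /tmp/myjob /tigress/your_netid/myjob'
```

Keeping one subdirectory per node is a design choice for this sketch. It avoids name clashes when every node writes files with the same names.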

Below is an example for a single-node job:

#!/bin/bash
#SBATCH --job-name=usetmp        # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks across all nodes
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G is default)
#SBATCH --time=00:01:00          # total run time limit (HH:MM:SS)

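export ScratchDir="/tmp/myjob"   # directory on the node-local disk
mkdir -p "$ScratchDir"           # create it if it does not already exist
./a.out "$ScratchDir"            # the application writes its output here

# copy the results to permanent storage before the job ends
cp -r "$ScratchDir" /tigress/your_netid/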
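
A fixed name such as /tmp/myjob may collide if two of your jobs run on the same node, unless the cluster gives each job a private /tmp. One way to avoid that is to build the directory name from the job ID, as in the sketch below. The function names make_scratch_dir and copy_back are hypothetical. SLURM_JOB_ID is set by Slurm inside a job, and the shell's process ID is used as a fallback elsewhere.

```shell
#!/bin/bash
# Hypothetical helper: create a job-specific directory under a base
# path (default /tmp) and print its name.
make_scratch_dir() {
    local base="${1:-/tmp}"
    # SLURM_JOB_ID is set by Slurm inside a job; otherwise use the shell PID.
    local dir="${base}/${USER:-user}_${SLURM_JOB_ID:-$$}"
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Hypothetical helper: copy the scratch directory into the destination,
# then remove it from the local disk only if the copy succeeded.
copy_back() {
    local src="$1" dest="$2"
    mkdir -p "$dest" && cp -r "$src" "$dest"/ && rm -rf "$src"
}

# Possible use in place of the fixed path in a job script:
#   ScratchDir=$(make_scratch_dir) || exit 1
#   ./a.out "$ScratchDir"
#   copy_back "$ScratchDir" /tigress/your_netid
```

Removing the directory only after a successful copy keeps the data on the node if the destination is unavailable, at the cost of leaving files in /tmp in that case.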

Remember: No /tmp or /scratch filesystem is ever backed up!