How do I check my job's memory, local disk or thread usage?

  • Once a job has started to run, use squeue -j <jobid> to determine the nodes allocated to the job (see the NODELIST column).
  • Log in to each of these nodes with ssh <node_name>.
  • Use top -u <username> to see the memory used by each of your processes on that node (see the RES column).
  • If your job is multi-threaded, use top -H -u <username> (capital H) to see information on each of your threads.
  • While logged in to each compute node, check the amount of disk space you are using on that node's /scratch disk, for example with du -sh on your own directory under /scratch.
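
The steps above can be roughly sketched as a few shell helpers. This is a minimal, non-authoritative sketch: the function names (job_nodes, sum_kib, node_usage) are hypothetical, and the default directory /scratch/<username> is an assumption about how scratch space is laid out on your cluster. It uses ps rather than top so the numbers can be captured non-interactively.

```shell
#!/bin/sh
# Hypothetical helpers; names and the /scratch/$user default are assumptions.

# List the nodes allocated to a running job, one hostname per line.
# Run this on a login node; $1 is the job ID.
job_nodes() {
    scontrol show hostnames "$(squeue -h -j "$1" -o %N)"
}

# Sum a column of sizes in KiB (one number per line) read from stdin.
sum_kib() {
    awk '{ total += $1 } END { printf "%d\n", total }'
}

# Summarize your own usage on the node you are logged in to.
# $1 is the directory to measure (defaults to /scratch/<username>).
node_usage() {
    user=$(id -un)
    dir=${1:-/scratch/$user}
    echo "node: $(hostname)"
    # Total resident memory of all your processes (the RES column in top).
    echo "resident memory (KiB): $(ps -u "$user" -o rss= | sum_kib)"
    # One line per thread, so the line count is your thread count.
    echo "threads: $(ps -L -u "$user" -o lwp= | wc -l)"
    if [ -d "$dir" ]; then
        echo "disk used in $dir: $(du -sh "$dir" 2>/dev/null | cut -f1)"
    fi
}
```

After ssh-ing to a node listed by job_nodes <jobid>, source the file and run node_usage, optionally passing the directory your job actually writes to. The totals cover all of your processes on that node, not only those belonging to one job.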

Read this post about tuning your memory requirements and understanding memory error messages.