A large number of scientific applications and software tools are installed on the HPC clusters. Users can install custom software in their /home directories and elsewhere. Software containers can be used via Singularity which is a container platform that is similar to Docker.
Some software is made available via environment modules. Read through the environment modules page and then return here. After doing so, you should be able to:
- use "module avail" to list available modules
- load and unload modules
- use "module show" to see how loading a module changes your environment
A Word on Python Modules
To get an Anaconda Python implementation (the recommended way to use Python), simply load one of the available anaconda3 modules. For example:
$ module load anaconda3/2021.11
$ python --version
Python 3.9.7
$ which python
/usr/licensed/anaconda3/2021.11/bin/python
To see all the packages that are included in Anaconda Python run this command:
$ conda list
For more on Anaconda Python, conda and custom Conda environments, see Python on the HPC Clusters.
A Word on R Modules
To start the latest version of R, load the default R module and then launch R:

$ module load R
$ R

To use an earlier version, see which modules are available and then load one:

$ module avail R
R/3.6.3  R/4.0.5
$ module load R/4.0.5
Final Tips on Modules
Remember that no modules are loaded when you connect to a cluster, and don't put module commands in your .bashrc file. The best practice is to load the modules you need each time, either interactively or in your Slurm script. By all means set up an alias for interactive use, but keep in mind that .bashrc is not sourced for a Slurm job, so relying on it can leave you with a tangle of loaded modules and a job that fails in ways that are hard to diagnose.
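For example, instead of relying on .bashrc, load what the job needs inside the job script itself. A minimal sketch (the module version and the script name myscript.py are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=py-job         # short name for the job
#SBATCH --time=00:05:00           # walltime limit (HH:MM:SS)

module purge                      # start from a clean module environment
module load anaconda3/2021.11     # load exactly what this job needs

python myscript.py                # illustrative script name
```

Because the script loads its own modules, the job behaves the same no matter what is loaded in your interactive session.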
If you need software that is not installed or made available through modules, you will most likely have to install the software yourself. The following section provides the needed guidelines.
In general, to install software not available as a module, we recommend that you create a directory such as "/home/&lt;YourNetID&gt;/software" to build and store software. (As a reminder, your /home directory is backed up.)
One exception to this general recommendation is when installing Python and R packages. Python and R packages are installed by default in your /home directory, and therefore don't require that you set up a special folder for them. See more about installing Python or R packages below.
- Be sure to run the checkquota command regularly to make sure you have enough space. Errors found when installing packages can often come down to this.
- Commands like "sudo yum install" or "sudo apt-get install" will not work.
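The recommended build area can be created with a single command (the path is just the convention suggested above, not a requirement):

```shell
# Create a directory for building and storing software, per the
# recommendation above; -p makes this a no-op if it already exists.
mkdir -p "$HOME/software"
```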
Installing Python Packages on the HPC Clusters
See this guide to installing Python packages with conda or pip on Princeton Research Computing's Python resource page.
Installing R Packages on the HPC Clusters
On some clusters you must update the compiler via an environment module before installing certain R packages. This is mentioned in the Compiling Software, GNU Compiler Collection (GCC) section below and described in more detail in the linked guide to installing R packages.
Using Software Containers
Software containers can be really useful when you need software that may have tricky dependencies. You can pull and run an image (essentially a large file) that contains the software and everything it needs.
$ singularity pull docker://hello-world
$ singularity run hello-world_latest.sif
...
Hello from Docker!
This message shows that your installation appears to be working correctly.
...
For more information see Containers on the HPC Clusters.
Compiling Software, GNU Compiler Collection (GCC)
Software that comes in source form must be compiled before it can be installed in your /home directory.
One popular tool suite for doing this is the GNU Compiler Collection (GCC) which is composed of compilers, a linker, libraries and tools.
To provide a stable environment for building software on our HPC clusters, the default version of GCC is kept the same for years at a time. To see the current version of the GNU C++ compiler, namely g++, run the following command on one of the HPC clusters (e.g., Della):
$ g++ --version
g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4)
At times a newer version of GCC is needed. This is made available by loading one of the GCC Toolset (gcc-toolset) modules:
$ module load gcc-toolset/10
$ g++ --version
g++ (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1)
Note that the C and Fortran compilers and related tools are also updated by this method which is important for some software. The relevant tools are gcc, g++, gfortran, make, ld, ar, as, gdb, gprof, gcov and more.
When compiling a parallel code that uses the message-passing interface (MPI), you will need to load an MPI module. You can load the Intel compilers and Intel MPI library with:
$ module load intel/184.108.40.206 intel-mpi/intel/2019.7
Loading intel/220.127.116.11
  Loading requirement: intel-mkl/2020.1
Loading intel-mpi/intel/2019.7
  Loading requirement: ucx/1.9.0
$ mpicc --version
icc (ICC) 18.104.22.168 20200306
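Once compiled, an MPI executable is launched through the scheduler rather than run directly. A minimal Slurm script might look like the sketch below (module names, task counts, and the executable name myexe are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=mpi-test       # short name for the job
#SBATCH --nodes=2                 # number of nodes
#SBATCH --ntasks-per-node=4       # MPI ranks per node
#SBATCH --time=00:10:00           # walltime limit (HH:MM:SS)

module purge
# Load the same compiler and MPI modules used to build the code
# (illustrative names; match the versions you compiled with).
module load intel intel-mpi/intel/2019.7

srun ./myexe
```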
Modern CPUs can perform more than one operation per cycle using vector execution units. A common example is elementwise vector addition.
Vectorized code generated for one processor will not run on another processor unless it supports the same instructions; attempting to run it anyway produces an "illegal instruction" error.
Della is composed of three different Intel Xeon microarchitectures:
- Broadwell (AVX2)
- Skylake (AVX-512)
- Cascade Lake (AVX-512)
The head node della8.princeton.edu is Cascade Lake. If you compile a code on the head node with AVX-512 instructions then it will fail on the (older) Broadwell nodes. One solution is to exclude the Broadwell nodes in your Slurm script by adding:
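Such a constraint might look like the line below (the feature names skylake and cascade are an assumption; check your cluster's actual node features, e.g. with the snodes command):

```shell
#SBATCH --constraint="skylake|cascade"   # '|' means OR: allow either node type
```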
The constraint above will cause the job to run on the Skylake nodes and newer (i.e., Cascade Lake).
Another solution is to make a "fat binary", which is an executable that includes instructions for multiple CPU generations. When using the Intel compiler, for instance, one would use:
$ icc -xCORE-AVX2 -axCORE-AVX512 -Ofast -o myexe mycode.c
You can see which node type your job ran on with the command "shistory -j". Use the "snodes" command to see all the nodes.
TigerCPU vs. TigerGPU
The processor on tigercpu.princeton.edu supports AVX-512 instructions while those on tigergpu.princeton.edu can only do AVX2.
Be sure to compile codes for tigercpu on tigercpu.princeton.edu and compile codes for tigergpu on tigergpu.princeton.edu.
If you ssh to tiger.princeton.edu then you will land on tigercpu.princeton.edu.