A large number of scientific applications and software tools are installed on the Research Computing clusters. Users can also install custom software in their /home directories and elsewhere. Software containers can be used via Apptainer, a container platform for HPC systems.

Environment Modules

Some software is made available via environment modules. Read through the environment modules page and then return here. After reading about environment modules you should be able to:

  • use "module avail" to list available modules
  • load and unload modules
  • use "module show" to see how loading a module changes your environment
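
The three tasks above might look like the following session. This is only a sketch: the anaconda3/2024.10 module name is borrowed from the example further down this page, and the guard on the first line is there just to make the snippet harmless on a machine where the module command is not defined.

```shell
# Typical module session (sketch). The guard skips everything on machines
# where the "module" command is not defined, e.g. a laptop.
if command -v module >/dev/null 2>&1; then
    module avail                      # list every available module
    module avail anaconda3            # list only modules whose name matches
    module load anaconda3/2024.10     # add the module to your environment
    module list                       # show what is currently loaded
    module show anaconda3/2024.10     # show what loading the module changes
    module unload anaconda3/2024.10   # remove that one module
    module purge                      # remove all loaded modules
    modules_available=yes
else
    modules_available=no
fi
echo "module command available: $modules_available"
```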

A Word on Python Modules

To get an Anaconda Python implementation (the recommended way to use Python), simply load one of the available anaconda3 modules. For example:

$ module load anaconda3/2024.10
$ python --version
Python 3.12.7
$ which python
/usr/licensed/anaconda3/2024.10/bin/python

To see all the packages that are included in Anaconda Python run this command:

$ conda list

For more on Anaconda Python, conda and custom Conda environments, see our Python page.
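
As a rough illustration of what a custom Conda environment involves, the sketch below creates one and runs Python inside it. The environment name "myenv" and the choice of numpy are hypothetical placeholders, and the Python page remains the reference for the recommended procedure.

```shell
# Sketch: create and use a custom Conda environment.
# "myenv" and the package list (numpy) are hypothetical placeholders.
env_name="myenv"
if command -v module >/dev/null 2>&1; then
    module load anaconda3/2024.10
    conda create --name "$env_name" numpy --yes   # build the new environment
    conda activate "$env_name"                    # switch to it
    python -c "import numpy; print(numpy.__version__)"
    conda deactivate                              # leave the environment
fi
```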

Starting R

First, see which modules are available and then load one:

$ module avail R
R/3.6.3  R/4.0.5  R/4.1.3  R/4.2.3  R/4.3.1  R/4.4.1
$ module load R/4.4.1

RStudio is available through the MyAdroit and MyDella web portals. For a comprehensive guide on R and RStudio see our R/RStudio page.

Final Tips on Modules

Remember that no modules are loaded when you connect to a cluster. Do not put module commands in your .bashrc file; the best practice is to load the modules you need explicitly each time. By all means set up an alias for your own use, but keep in mind that .bashrc is not implicitly loaded for a Slurm job. Loading modules from .bashrc is likely to leave you with a tangle of modules and no clear idea of why your job is failing to behave as expected.
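
One way to follow this advice is to load modules inside the job script itself. The commands below write a minimal Slurm script that does so. Treat it as a sketch: the resource requests and the script name myscript.py are hypothetical placeholders, and the anaconda3 module is simply the one used earlier on this page.

```shell
# Write a minimal Slurm job script that loads its modules explicitly.
# The resource requests and "myscript.py" are hypothetical placeholders.
cat > job.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=py-test       # short name for the job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks
#SBATCH --cpus-per-task=1        # CPU cores per task
#SBATCH --mem-per-cpu=4G         # memory per CPU core
#SBATCH --time=00:10:00          # run time limit (HH:MM:SS)

module purge                     # start from a clean environment
module load anaconda3/2024.10    # load exactly what the job needs

python myscript.py
EOF

# Submit it on a cluster with: sbatch job.slurm
```

Because the module commands live in the script, the job does not depend on whatever happens to be loaded in the shell you submit from.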

If you need software that is not installed or made available through modules, you will most likely have to install the software yourself. The following section provides the needed guidelines.

Installing Software Not Available on the Clusters

In general, to install software not available as a module, we recommend that you create a directory such as "/home/<YourNetID>/software" to build and store software. (As a reminder, your /home directory is backed up.) Note that our systems run the Red Hat Linux operating system, so be sure to use the Linux-compatible distribution of your software.
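
A minimal sketch of this layout follows. The package name "mytool" is hypothetical, and the commented build steps assume a package that ships a conventional configure script; other build systems will differ.

```shell
# Sketch: keep self-installed software under one directory in /home.
# "mytool" is a hypothetical package name.
software_dir="$HOME/software"
mkdir -p "$software_dir"

# For a package with a typical configure script, the build would then be
# (run inside the unpacked source directory):
#   ./configure --prefix="$software_dir/mytool"
#   make
#   make install

# Make the installed executables findable in the current shell session.
export PATH="$software_dir/mytool/bin:$PATH"
```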

One exception to this general recommendation is when installing Python and R packages. Python and R packages are installed by default in your /home directory, and therefore don't require that you set up a special folder for them. See more about installing Python or R packages below.

Two notes:

  • Be sure to run the checkquota command regularly to make sure you have enough space. Errors during package installation often come down to a full quota.
  • Commands like "sudo yum install" or "sudo apt-get install" will not work.

Installing Python Packages on the Research Computing Clusters

See this guide to installing Python packages with conda or pip on Princeton Research Computing's Python resource page.

Installing R Packages on the Research Computing Clusters

See this guide to installing R packages on Princeton Research Computing's R resource page.

Using Software Containers

Software containers can be really useful when you need software that may have tricky dependencies. You can pull and run an image (essentially a large file) that contains the software and everything it needs.

Docker is not allowed on the clusters, but Apptainer can be used instead. You can still search for and use Docker images; you just need to run them with Apptainer commands. For example:

$ apptainer pull docker://hello-world
$ apptainer run hello-world_latest.sif
...
Hello from Docker!
This message shows that your installation appears to be working correctly.
...
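
Beyond "apptainer run", which executes the image's default action, "apptainer exec" runs a command of your choosing inside the container. The sketch below assumes the python:3.12-slim image from Docker Hub purely as an example and picks the output file name python.sif; neither comes from this page.

```shell
# Sketch: run a single command inside a container instead of its default action.
# The image (python:3.12-slim) and the output name python.sif are example choices.
if command -v apptainer >/dev/null 2>&1; then
    apptainer pull python.sif docker://python:3.12-slim   # download and convert
    apptainer exec python.sif python --version            # run one command inside
    container_demo=ran
else
    container_demo=skipped
fi
echo "container demo: $container_demo"
```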

Compiling Software, GNU Compiler Collection (GCC)

Software that comes in source form must be compiled before it can be installed in your /home directory.

One popular tool suite for doing this is the GNU Compiler Collection (GCC), which is composed of compilers, a linker, libraries and tools.

To provide a stable environment for building software on our HPC clusters, the default version of GCC is kept the same for years at a time. To see the current version of the GNU C++ compiler, namely g++, run the following command on one of the HPC clusters (e.g., Della):

$ g++ --version
g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4)

At times a newer version of GCC is needed. Newer versions are made available by loading one of the Red Hat GCC Toolset (gcc-toolset) modules:

$ module load gcc-toolset/10
$ g++ --version
g++ (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1)

Note that the C and Fortran compilers and related tools are also updated by this method, which is important for some software. The relevant tools are gcc, g++, gfortran, make, ld, ar, as, gdb, gprof, gcov and more.
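
To check that a compiler works, it can help to build a one-file program. The sketch below writes a small C++ source file and compiles it with whichever g++ is first on your PATH, so load a gcc-toolset module beforehand if you want the newer compiler. The file names are arbitrary.

```shell
# Sketch: compile and run a one-file C++ program with g++.
cat > hello.cpp <<'EOF'
#include <iostream>

int main() {
    std::cout << "Hello from g++" << std::endl;
    return 0;
}
EOF

if command -v g++ >/dev/null 2>&1; then
    g++ -O2 -Wall -o hello hello.cpp   # -O2: optimize, -Wall: common warnings
    ./hello
fi
```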

When compiling a parallel code that uses the message-passing interface (MPI), you will need to load an MPI module.
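
As a sketch of what this looks like in practice, the commands below write a small MPI program in C and compile it with mpicc, the compiler wrapper that MPI implementations commonly provide. The name of the MPI module to load is not given on this page, so check "module avail" for it; the file names are arbitrary.

```shell
# Sketch: build a small MPI program with the mpicc compiler wrapper.
# Load an MPI module first; its exact name is site-specific (see "module avail").
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile only where an MPI compiler wrapper is on the PATH.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -O2 -o hello_mpi hello_mpi.c
fi
# Inside a Slurm job, launch it with: srun ./hello_mpi
```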

Vectorization

Modern CPUs can perform more than one operation per cycle using vector processing units (VPUs), which apply a single instruction to several data elements at once. A common example is elementwise vector addition.
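
To see whether the compiler vectorizes such a loop, you can ask GCC to report on it. The sketch below writes an elementwise addition routine in C and compiles it with GCC's -fopt-info-vec option; the file names are arbitrary, and the wording of the compiler's report varies between GCC versions.

```shell
# Sketch: ask GCC to report which loops it vectorizes.
cat > vecadd.c <<'EOF'
#include <stddef.h>

/* Elementwise vector addition: c[i] = a[i] + b[i]. */
void vecadd(size_t n, const float *restrict a,
            const float *restrict b, float *restrict c)
{
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
EOF

if command -v gcc >/dev/null 2>&1; then
    # -O3 enables the auto-vectorizer; -fopt-info-vec prints a note for each
    # loop that was vectorized. -c compiles to an object file without linking.
    gcc -O3 -fopt-info-vec -c vecadd.c -o vecadd.o
fi
```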

Vectorized code generated for one processor will run on another processor only if that processor supports the same vector instructions. Otherwise, the program fails with an "illegal instruction" error.
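
One way to check which vector instructions a node supports is to read the CPU flags that Linux reports. The sketch below assumes an x86 node, where /proc/cpuinfo contains a "flags" line, and looks for only a handful of common extensions.

```shell
# Sketch: list the vector instruction sets advertised by this node's CPU.
# Assumes Linux on x86, where /proc/cpuinfo has a "flags" line.
simd_flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null \
    | tr ' ' '\n' \
    | grep -E '^(sse4_2|avx|avx2|avx512f)$' \
    | sort -u \
    | tr '\n' ' ')
echo "Vector extensions on $(uname -n): ${simd_flags:-none detected}"
```

Run this on both the node where you compile and the nodes where the job runs. If the compute nodes lack an extension that the build node has, avoid GCC flags such as -march=native, which target the CPU of the machine doing the compiling.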

Working with Software Requiring Graphical User Interfaces (GUIs)

To work with graphical applications on our systems, view our guide to working with visualizations and graphical user-interface (GUI) applications.