A large number of scientific applications and software tools are installed on the HPC clusters. Users can install custom software in their /home directories and elsewhere. Software containers can be used via Singularity, a container platform similar to Docker.

Environment Modules

Some software is made available via environment modules. Read through the environment modules page and then return here. After reading about environment modules you should be able to:

- use "module avail" to list available modules
- load and unload modules
- use "module show" to see how loading a module changes your environment

A Word on Python Modules

To get an Anaconda Python implementation (the recommended way to use Python), simply load one of the available anaconda3 modules. For example:

$ module load anaconda3/2022.10
$ python --version
Python 3.9.13
$ which python
/usr/licensed/anaconda3/2022.10/bin/python

To see all the packages that are included in Anaconda Python, run this command:

$ conda list

For more on Anaconda Python, conda and custom Conda environments, see Python on the HPC Clusters.

A Word on R Modules

To start the latest version of R, simply type:

$ R

To use an earlier version, see which modules are available and then load one:

$ module avail R
R/3.6.3 R/4.0.5 R/4.1.3
$ module load R/4.0.5

RStudio is available through the MyAdroit and MyDella web portals. For a comprehensive guide to R and RStudio, see R on the HPC Clusters.

Final Tips on Modules

Remember that no modules are loaded when you connect to a cluster, and don't put module commands in your .bashrc file. Best practice is to load the modules you need each time, in your shell session or in your Slurm script. By all means set up an alias for interactive use, but keep in mind that .bashrc is not implicitly sourced for a Slurm job. Relying on it can leave you with a tangle of modules and no clear idea of why your job is failing to behave as expected.

If you need software that is not installed or made available through modules, you will most likely have to install the software yourself. The following section provides the needed guidelines.

Installing Software Not Available on the Clusters

In general, to install software not available as a module, we recommend that you create a directory such as "/home/<YourNetID>/software" to build and store software. (As a reminder, your /home directory is backed up.) Note that our systems run the Springdale Linux operating system, so be sure to use the Linux-compatible distribution of your software.
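As a concrete illustration, here is a minimal sketch of a typical from-source build into such a directory. It assumes a hypothetical package, mytool-1.0, that uses the standard GNU Autotools steps; the package name and URL are placeholders, not a real download:

$ mkdir -p $HOME/software/src
$ cd $HOME/software/src
$ wget https://example.com/mytool-1.0.tar.gz   # hypothetical source archive; substitute the real one
$ tar -xzf mytool-1.0.tar.gz
$ cd mytool-1.0
$ ./configure --prefix=$HOME/software          # install under /home/<YourNetID>/software
$ make
$ make install

The --prefix option directs the installed files into your /home directory rather than system locations, which matters because you do not have administrative privileges on the clusters. The exact build steps vary by package, so always check its installation instructions.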
One exception to this general recommendation is when installing Python and R packages. Python and R packages are installed by default in your /home directory, and therefore don't require that you set up a special folder for them. See more about installing Python or R packages below.

Two notes:

- Be sure to run the checkquota command regularly to make sure you have enough space. Errors encountered when installing packages often come down to a lack of space.
- Commands like "sudo yum install" or "sudo apt-get install" will not work.

Installing Python Packages on the HPC Clusters

See this guide to installing Python packages with conda or pip on Princeton Research Computing's Python resource page.

Installing R Packages on the HPC Clusters

See this guide to installing R packages on Princeton Research Computing's R resource page.

Using Software Containers

Software containers can be very useful when you need software with tricky dependencies. You can pull and run an image (essentially a large file) that contains the software and everything it needs.

We do not allow Docker, but Singularity can be used. You can still search for and use images from Docker; you just need to use Singularity commands. For example:

$ singularity pull docker://hello-world
$ singularity run hello-world_latest.sif
...
Hello from Docker!
This message shows that your installation appears to be working correctly.
...

For more information see Containers on the HPC Clusters.

Compiling Software with the GNU Compiler Collection (GCC)

Software that comes in source form must be compiled before it can be installed in your /home directory. One popular tool suite for doing this is the GNU Compiler Collection (GCC), which is composed of compilers, a linker, libraries and tools.

To provide a stable environment for building software on our HPC clusters, the default version of GCC is kept the same for years at a time. To see the current version of the GNU C++ compiler, namely g++, run the following command on one of the HPC clusters (e.g., Della):

$ g++ --version
g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4)

At times a newer version of GCC is needed. This is made available by loading one of the Red Hat GCC toolset (gcc-toolset) modules:

$ module load gcc-toolset/10
$ g++ --version
g++ (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1)

Note that the C and Fortran compilers and related tools are also updated by this method, which is important for some software. The relevant tools are gcc, g++, gfortran, make, ld, ar, as, gdb, gprof, gcov and more.

When compiling a parallel code that uses the Message Passing Interface (MPI), you will need to load an MPI module. You can load the Intel compilers and the Intel MPI library with:

$ module load intel/19.1.1.217 intel-mpi/intel/2019.7
Loading intel/19.1.1.217
  Loading requirement: intel-mkl/2020.1
Loading intel-mpi/intel/2019.7
  Loading requirement: ucx/1.9.0
$ mpicc --version
icc (ICC) 19.1.1.217 20200306

Vectorization

Modern CPUs can perform more than one operation per cycle using vector execution units; a common example is elementwise vector addition. Vectorized code generated for one processor will not run on another processor unless that processor supports the same instructions. Such an attempt will produce an "illegal instruction" error.

Della is composed of three different Intel Xeon microarchitectures:

- Broadwell (AVX2)
- Skylake (AVX-512)
- Cascade Lake (AVX-512)

The head node della8.princeton.edu is Cascade Lake. If you compile a code on the head node with AVX-512 instructions, it will fail on the (older) Broadwell nodes. One solution is to exclude the Broadwell nodes in your Slurm script by adding:

#SBATCH --constraint=skylake

The constraint above will cause the job to run only on the Skylake nodes and newer (i.e., Cascade Lake).

Another solution is to make a "fat binary", an executable that includes instructions for multiple CPU generations. When using the Intel compiler, for instance, one would use:

$ icc -xCORE-AVX2 -axCORE-AVX512 -Ofast -o myexe mycode.c

Learn more about the -x option and the -ax option.

You can see which node type your job ran on with the command "shistory -j". Use the "snodes" command to see all the nodes.

Working with Software Requiring Graphical User Interfaces (GUIs)

To work with graphical applications on our systems, view our guide to working with visualizations and graphical user interface (GUI) applications.