PETSc is a popular suite of data structures and routines for the scalable solution of scientific applications modeled by partial differential equations. This webpage provides a starting point for building PETSc on the HPC clusters.
PETSc is highly configurable, so it is not pre-installed on the HPC clusters; users must build their own version. Read the PETSc installation page before building the software. Below we provide build instructions for specific configurations, which you will need to modify for your own needs.
To see all the possible options do the following:
$ git clone -b release https://gitlab.com/petsc/petsc.git petsc
$ cd petsc && git pull
$ ./configure --help
To search on a specific keyword such as “blas”:
$ ./configure --help | grep -i blas
Stellar
Below is a sample build procedure on stellar-intel:
ssh <YourNetID>@stellar-intel.princeton.edu
cd software  # or another directory of your choosing
git clone -b release https://gitlab.com/petsc/petsc.git petsc
cd petsc
module purge
module load cmake/3.19.7
module load intel/2021.1.2
module load intel-mpi/intel/2021.3.1
module load fftw/intel-2021.1/intel-mpi/3.3.9
module load hdf5/intel-2021.1/intel-mpi/1.10.6
unset I_MPI_HYDRA_BOOTSTRAP  # for testing only, do not include in Slurm script
unset I_MPI_PMI_LIBRARY      # for testing only, do not include in Slurm script
OPTFLAGS=""
./configure --with-clean --with-ssl=0 --with-c++-support --with-debugging=0 \
--with-shared-libraries=0 --download-metis --download-parmetis --download-superlu_dist \
--download-superlu --download-mumps --with-fftw-dir=$FFTW3DIR --with-hdf5-dir=$HDF5DIR \
--download-blacs --download-fblaslapack --download-zoltan --with-mpi-compilers=1 \
--with-scalar-type=real PETSC_ARCH=real-intel2021.1.2-intelmpi --download-scalapack \
--with-mpi-dir=$I_MPI_ROOT
make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=real-intel2021.1.2-intelmpi all
make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=real-intel2021.1.2-intelmpi check
The command "unset I_MPI_HYDRA_BOOTSTRAP" prevents errors arising from the PETSc build system trying to run batch jobs. On Stellar you should also do "unset I_MPI_PMI_LIBRARY". These commands are necessary since Intel MPI was built for Slurm and Slurm is not used on the login nodes. Do not unset these variables outside of the install procedure. That is, when running jobs simply load the modules and do not unset.
TigerCPU
Below is an example installation procedure on TigerCPU:
$ ssh <YourNetID>@tigercpu.princeton.edu
$ cd software  # or another directory of your choosing
$ git clone -b release https://gitlab.com/petsc/petsc.git petsc
$ cd petsc
$ module load intel/19.0/64/19.0.5.281 intel-mpi/intel/2019.5/64
$ module load cmake/3.x rh/devtoolset/7
$ OPTFLAGS="-Ofast -xHost -DNDEBUG"
$ unset I_MPI_HYDRA_BOOTSTRAP  # for testing only, do not include in Slurm script
$ ./configure PETSC_ARCH=intel-mkl-complex --with-blaslapack-dir=$MKLROOT \
--with-scalapack-include=$MKLROOT/include \
--with-scalapack-lib="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" \
--with-debugging=0 --COPTFLAGS="$OPTFLAGS" --CXXOPTFLAGS="$OPTFLAGS" --FOPTFLAGS="$OPTFLAGS" \
--with-scalar-type=complex
$ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=intel-mkl-complex all
$ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=intel-mkl-complex check
The procedure above uses the Intel compilers and the Intel MPI library. Loading the cmake and rh modules lets the PETSc build system learn more about the host machine; if these two modules are not loaded, multiple warnings about data types will appear. In addition to taking advantage of compiler optimizations and vectorization, the procedure builds PETSc against the Intel Math Kernel Library (MKL) for BLAS, LAPACK and ScaLAPACK, which gives a performance gain over the reference implementations from netlib.org. The command "unset I_MPI_HYDRA_BOOTSTRAP" prevents errors that arise when the PETSc build system tries to launch batch jobs. One way to confirm the MKL linkage is shown below.
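As a quick sanity check (a sketch, assuming a shared-library build, which is the default for the configure line above, so that libpetsc.so exists under the chosen PETSC_ARCH), you can verify that the resulting library resolves its BLAS/LAPACK symbols from MKL:

$ module load intel/19.0/64/19.0.5.281 intel-mpi/intel/2019.5/64
$ ldd $HOME/software/petsc/intel-mkl-complex/lib/libpetsc.so | grep -i mkl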
Della
Della is composed of different generations of Intel processors. The example below makes a so-called fat binary, which allows it to run optimally on both AVX2 and AVX-512 nodes:
$ ssh <YourNetID>@della8.princeton.edu
$ cd software
$ git clone -b release https://gitlab.com/petsc/petsc.git petsc
$ cd petsc
$ module purge
$ module load cmake/3.18.2
$ module load intel/19.1.1.217 intel-mpi/intel/2019.7
$ unset I_MPI_HYDRA_BOOTSTRAP  # for testing only, do not include in Slurm script
$ unset I_MPI_PMI_LIBRARY      # for testing only, do not include in Slurm script
$ OPTFLAGS="-Ofast -xCORE-AVX2 -axCORE-AVX512"
$ ./configure PETSC_ARCH=intel-mkl-double-complex --with-blaslapack-dir=$MKLROOT \
--with-scalapack-include=$MKLROOT/include \
--with-scalapack-lib="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" \
--download-mumps --download-superlu_dist --with-cuda=0 --download-hypre --with-debugging=0 \
COPTFLAGS="$OPTFLAGS" CXXOPTFLAGS="$OPTFLAGS" FOPTFLAGS="$OPTFLAGS" \
--with-scalar-type=complex --with-precision=double
$ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=intel-mkl-double-complex all
$ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=intel-mkl-double-complex check
Note that we link against Intel MKL for BLAS, LAPACK and ScaLAPACK. This gives better performance than the versions that PETSc can download itself (i.e., --download-fblaslapack and --download-scalapack). We also use optimization flags that take full advantage of each of Della's microarchitectures. After the build completes, the sketch below shows one way to gauge how memory bandwidth scales with the number of MPI ranks.
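PETSc provides a STREAMS-style benchmark through its "streams" make target (a sketch, assuming this target is present in your PETSc version; run it on a compute node, for example inside an interactive salloc session, rather than on a login node):

$ cd $HOME/software/petsc
$ make PETSC_DIR=$HOME/software/petsc PETSC_ARCH=intel-mkl-double-complex streams

The output can help you decide how many MPI ranks per node are worthwhile for memory-bandwidth-bound solvers.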
Traverse
Below is an example build for the Traverse cluster:
ssh <YourNetID>@traverse.princeton.edu
cd software
git clone -b release https://gitlab.com/petsc/petsc.git petsc
cd petsc
module purge
module load cmake/3.19.7
module load openmpi/gcc/4.1.1/64
module load cudatoolkit/11.4
OPTFLAGS="-Ofast -mcpu=power9 -mtune=power9 -DNDEBUG"
CUDAFLAGS="-O3 --use_fast_math -arch=sm_70"
./configure PETSC_ARCH=openmpi-power \
--download-fblaslapack --with-debugging=0 \
--with-cuda=1 --CUDAOPTFLAGS="$CUDAFLAGS" --with-cuda-arch=70 \
--with-cxx-dialect=c++14 --with-cuda-dialect=c++14 \
--COPTFLAGS="$OPTFLAGS" --CXXOPTFLAGS="$OPTFLAGS" --FOPTFLAGS="$OPTFLAGS" \
--with-scalar-type=complex --with-batch=1
make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all
make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power check
The procedure above might be improved by linking against IBM ESSL or OpenBLAS instead of the reference BLAS/LAPACK obtained with --download-fblaslapack; one possibility is sketched below.
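For example, a sketch of an OpenBLAS variant of the configure line above, assuming the --download-openblas option available in recent PETSc releases (linking against ESSL would instead require pointing --with-blaslapack-lib at the ESSL libraries):

./configure PETSC_ARCH=openmpi-power-openblas \
--download-openblas --with-debugging=0 \
--with-cuda=1 --CUDAOPTFLAGS="$CUDAFLAGS" --with-cuda-arch=70 \
--with-cxx-dialect=c++14 --with-cuda-dialect=c++14 \
--COPTFLAGS="$OPTFLAGS" --CXXOPTFLAGS="$OPTFLAGS" --FOPTFLAGS="$OPTFLAGS" \
--with-scalar-type=complex --with-batch=1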
Large Indices
In some cases you may need to build a version with 64-bit integers. The following builds PETSc against the 64-bit Intel MKL BLAS/LAPACK with multithreading on TigerCPU:
module load intel/19.0/64/19.0.5.281 intel-mpi/intel/2019.5/64 rh/devtoolset/6
OPTFLAGS="-Ofast -xHost -mtune=skylake-avx512 -DNDEBUG"
git clone -b maint https://gitlab.com/petsc/petsc.git petsc
cd petsc
./configure PETSC_ARCH=arch-linux2-64 --with-debugging=0 \
--COPTFLAGS='-O3 -xHost -DMKL_ILP64' --CXXOPTFLAGS='-O3 -xHost -DMKL_ILP64' \
--with-blaslapack-include="${MKLROOT}/include" \
--with-blaslapack-lib="-L${MKLROOT}/lib/intel64 -lmkl_intel_ilp64 \
-lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm -ldl"
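The configure line above uses the 64-bit-integer (ILP64) interface of MKL. If PETSc's own index type (PetscInt) should also be 64-bit, PETSc provides a configure option for that; a sketch of the extra flag to append to the configure line above:

# append to the configure line above to make PetscInt 64-bit as well:
--with-64-bit-indices=1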
If you encounter an error such as "TESTING: configureMPIEXEC from config.packages.MPI(config/BuildSystem/config/packages/MPI.py:174)" or "Runaway process exceeded time limit", try running "unset I_MPI_HYDRA_BOOTSTRAP" before running the configure script. In some cases these errors can also be addressed by adding --with-mpiexec="srun -N 1 -n 1 -t 1" or --with-batch to the configure line.
CUDA
Below is an example of building PETSc with CUDA on TigerGPU:
ssh <YourNetID>@tigergpu.princeton.edu
git clone -b release https://gitlab.com/petsc/petsc.git petsc
cd petsc
module load cmake/3.x
module load rh/devtoolset/8
module load openmpi/gcc/3.1.5/64
module load cudatoolkit/11.3
OPTFLAGS="-O3 -march=native"
./configure PETSC_ARCH=arch-gcc-openmpi-cuda-release --with-debugging=0 \
--with-cxx-dialect=c++14 --with-cuda-dialect=c++14 \
--COPTFLAGS="$OPTFLAGS" --CXXOPTFLAGS="$OPTFLAGS" --FOPTFLAGS="$OPTFLAGS" \
--CUDAOPTFLAGS="-O3 --use_fast_math -arch=sm_60" --with-scalar-type=complex --with-fortran-kernels=1 \
--with-fortran-interface=1 --with-cuda=1 --download-slepc=yes
make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=arch-gcc-openmpi-cuda-release all
Running make check will fail because there is no GPU on the head node:
make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=arch-gcc-openmpi-cuda-release check
See the PETSc installation notes for how to use a GPU. One way to run the checks on a node that has a GPU is sketched below.
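A minimal sketch of running the checks from an interactive Slurm allocation on a compute node with a GPU (the resource request and time limit here are illustrative placeholders):

salloc --nodes=1 --ntasks=1 --gres=gpu:1 --time=00:10:00
module load rh/devtoolset/8 openmpi/gcc/3.1.5/64 cudatoolkit/11.3
make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=arch-gcc-openmpi-cuda-release check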
Running the Examples
When the Slurm job scheduler makes it difficult to run the tests using "make check", one can configure with "--with-batch=1" and then run the tests manually. For example, to build one of the tutorial examples:
$ PETSC_DIR=$HOME/software/petsc
$ PETSC_ARCH=openmpi-power
# load appropriate modules (e.g., nvhpc/20.11 and openmpi/nvhpc-20.7/4.0.4/64)
$ mpicc -I${PETSC_DIR}/include -I${PETSC_DIR}/${PETSC_ARCH}/include \
-L${PETSC_DIR}/${PETSC_ARCH}/lib \
-o ex19 ${PETSC_DIR}/src/snes/tutorials/ex19.c -lpetsc
In the Slurm script include:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${PETSC_DIR}/${PETSC_ARCH}/lib
srun ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres > ex19_1.tmp 2>&1
The result can then be compared to the expected output (see ${PETSC_DIR}/src/snes/tutorials/output/), for example as sketched below. The above will work for a minimal build; if you install additional packages then you will need to link against additional libraries.
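A quick comparison, assuming the reference output for this run is the ex19_1.out file in that directory (check the directory listing for the exact file name in your PETSc version):

$ diff ex19_1.tmp ${PETSC_DIR}/src/snes/tutorials/output/ex19_1.out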
Additional notes
For some builds you will need to run the PETSc configure script, modify the generated makefiles for your purposes, and then run make all. This approach has proven successful for building a multi-threaded version of MUMPS. The files that configure generates are noted below.
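As a pointer (based on the usual PETSc layout; the exact paths may vary by version), the generated build settings and a script to re-run configure with the same options live under the chosen PETSC_ARCH directory:

# compiler and linker settings generated by configure (edit with care):
${PETSC_DIR}/${PETSC_ARCH}/lib/petsc/conf/petscvariables
# script that re-runs configure with the same options:
${PETSC_DIR}/${PETSC_ARCH}/lib/petsc/conf/reconfigure-${PETSC_ARCH}.py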
See our tips on linking against the Intel MKL; that page also provides the URL of the Intel MKL Link Line Advisor.
If you encounter any difficulties with PETSc then please send an email to [email protected] or attend a help session.