Coming Soon - Stellar
Stellar, a heterogeneous cluster with AMD and Intel processors, is being built to support large-scale parallel jobs predominantly for use by researchers in astrophysical sciences, plasma physics, physics, chemical and biological engineering and atmospheric and oceanic sciences.
Stellar is Operating in Limited Access Mode
A portion of the Intel nodes and the /scratch/gpfs file system have been installed, and a number of researchers have accounts on the system to facilitate the migration off of Perseus and Eddy. Decommissioning those systems will enable the installation of the remainder of the Stellar nodes. See the "Perseus/Eddy to Stellar Migration Checklist" below for important information. All researchers should keep in mind that the Stellar cluster is operating in a beta state. If you encounter any problems please write to email@example.com.
To use the Stellar cluster you have to enable your Princeton Linux account, request an account on Stellar, and then log in through SSH.
- Enabling Princeton Linux Account
Stellar is a Linux cluster. If your Stellar account is your first Princeton OIT Linux account, then you need to enable your Linux account (link requires VPN if off-campus). If you need help, the process is described in the Knowledge Base article Unix: How do I enable/change the default Unix shell on my account? For more on Unix, you can see Introduction to Unix at Princeton. Once you have access, you should not need to register again unless your account goes unused for more than six months.
- Requesting Access to Stellar
Access to large clusters like Stellar is granted on the basis of brief faculty-sponsored proposals. See the section titled For large clusters: Submit a proposal or contribute for details.
If, however, you are part of a research group with a faculty member who has contributed to or has an approved project on Stellar, that faculty member can sponsor additional users by sending a request to firstname.lastname@example.org. Any non-Princeton user must be sponsored by a Princeton faculty or staff member for a Research Computer User (RCU) account.
- Logging into Stellar
Once you have been granted access to Stellar, you should be able to SSH into it using the command below:
$ ssh <YourNetID>@stellar.princeton.edu
For more on how to SSH, see the Knowledge Base article Secure Shell (SSH): Frequently Asked Questions (FAQ).
Since Stellar is a Linux system, knowing some basic Linux commands is highly recommended. For an introduction to navigating a Linux system, view the material associated with our Intro to Linux Command Line workshop.
Using Stellar also requires some knowledge of the file system, the module system, and the scheduler that handles each user's jobs. For an introduction to navigating Princeton's High Performance Computing systems, view the material associated with our Getting Started with the Research Computing Clusters workshop. Additional information specific to Stellar's file system, job scheduling priority, etc. can be found below.
Please remember that these are shared resources for all users.
The head node, stellar, should be used for interactive work only, such as compiling programs and submitting jobs as described below. No jobs should be run on the head node, other than brief tests that last no more than a few minutes. Where practical, we ask that you entirely fill the nodes so that CPU core fragmentation is minimized. On Stellar, that means requesting cores in multiples of 96.
Use the "snodes" command to see the number of available nodes. Each node has four sockets with 24 cores per socket (96 cores per node) and 8 GB of memory per core. The back-end network is 100 Gb/s InfiniBand (HDR100).
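As a rough illustration of the sizing arithmetic above, the sketch below derives the cores and memory per node from the figures stated on this page (4 sockets, 24 cores per socket, 8 GB per core) and rounds a requested core count up to whole nodes. The helper name nodes_for_cores is hypothetical and is not part of any Stellar tooling.

```shell
# Node geometry as stated on this page.
SOCKETS_PER_NODE=4
CORES_PER_SOCKET=24
GB_PER_CORE=8

CORES_PER_NODE=$((SOCKETS_PER_NODE * CORES_PER_SOCKET))
GB_PER_NODE=$((CORES_PER_NODE * GB_PER_CORE))

# Hypothetical helper: number of whole nodes needed for a given core count
# (integer division, rounded up).
nodes_for_cores() {
    echo $(( ($1 + CORES_PER_NODE - 1) / CORES_PER_NODE ))
}

echo "cores per node: $CORES_PER_NODE"
echo "memory per node: $GB_PER_NODE GB"
echo "nodes needed for 200 cores: $(nodes_for_cores 200)"
```

A request that does not divide evenly by 96 leaves part of a node idle, so it is usually worth adjusting the core count to the next full multiple.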
The /tigress and /projects directories are mounted on the login node (stellar) as well as the compute nodes over NFS. This is for access to data and software built for projects.
If you are coming from the Perseus/Eddy cluster(s) to Stellar then make sure you do these things:
- Recompile your code. Stellar runs on the RHEL8 operating system while Perseus and Eddy used RHEL7. Additionally, all of the various software libraries such as MPI and HDF5 have been rebuilt from source specifically for the hardware and network fabric of Stellar. Because of this, all users need to recompile their code from source using the compilers and libraries provided by the new environment modules.
- Use full environment module names. One can no longer use "module load anaconda3", for instance. Instead the full name of the module must be specified (e.g., module load anaconda3/2020.11). Use the "module avail" command to see the available environment modules.
- Be aware that /scratch/gpfs is shared between Stellar and Traverse. Users must be careful not to overwrite files by using the same job path on both clusters.
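One way to avoid collisions on the shared /scratch/gpfs file system is to include the cluster name in every job path. The sketch below is only an illustration: the /scratch/gpfs/<user> layout, the job_dir and cluster_name helpers, and the idea of deriving the cluster name from the host name are assumptions, not documented Stellar conventions.

```shell
# Hypothetical helper: reduce a host name such as
# "stellar-intel.princeton.edu" to its leading cluster name ("stellar").
cluster_name() {
    echo "$1" | sed 's/[-.].*//'
}

# Hypothetical helper: build a per-cluster job directory path.
# Arguments: scratch root, cluster name, job name.
job_dir() {
    printf '%s/%s/%s\n' "$1" "$2" "$3"
}

# Assumed layout: a personal directory under /scratch/gpfs (adjust as needed).
SCRATCH_ROOT="/scratch/gpfs/${USER:-netid}"
CLUSTER=$(cluster_name "$(uname -n)")

echo "job directory: $(job_dir "$SCRATCH_ROOT" "$CLUSTER" run01)"
```

With this scheme, the same job name launched on Stellar and on Traverse resolves to two different directories.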
The Intel nodes feature Cascade Lake processors with AVX-512 as the highest instruction set. As a starting point, consider using these optimization flags when compiling a C++ code, for instance:
$ ssh <YourNetID>@stellar-intel.princeton.edu
$ module load intel/2021.1.2
$ icpc -Ofast -xCORE-AVX512 -o mycode mycode.cpp
The Intel Math Kernel Library (MKL) is automatically loaded as a module when an Intel compiler module is loaded.
To use GCC on the Intel nodes instead:
$ module load gcc-toolset/10
$ g++ -Ofast -march=cascadelake -o mycode mycode.cpp
The AMD nodes feature the EPYC processor with AVX2 as the highest instruction set. See the Quick Reference Guide by AMD for compiler flags for different compilers (AOCC, GCC, Intel) and the AOCC user guides. As a starting point, consider using these optimization flags when compiling a C++ code, for instance:
$ ssh <YourNetID>@stellar-amd.princeton.edu
$ module load aocc/3.0.0 aocl/aocc/3.0_6
$ clang++ -Ofast -march=native -o mycode mycode.cpp
For a parallel Fortran code:
$ ssh <YourNetID>@stellar-amd.princeton.edu
$ module load aocc/3.0.0 aocl/aocc/3.0_6 openmpi/aocc-3.0.0/4.1.0
$ mpif90 -Ofast -march=native -o hw hello_world_mpi.f90
Load the aocl module to make available the BLIS and libFLAME linear algebra libraries by AMD as well as FFTW3 and ScaLAPACK. Excellent performance was found for the High-Performance LINPACK benchmark using GCC and these libraries.
If you wish to use the Intel compiler for the AMD nodes then consider these flags:
$ module load intel/2021.1.2
$ icpc -Ofast -march=core-avx2 -o mycode mycode.cpp
Use the -march option above if you encounter the following error message:
Please verify that both the operating system and the processor support Intel(R) X87, CMOV, MMX, FXSAVE, SSE, SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, POPCNT, AVX and F16C instructions.
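Because the Intel and AMD nodes top out at different instruction sets, it can help to confirm what a node supports before choosing flags. The sketch below classifies a CPU flag list such as the "flags" line of /proc/cpuinfo on Linux. The helper highest_simd is hypothetical; avx512f and avx2 are the flag names Linux reports for AVX-512 Foundation and AVX2.

```shell
# Hypothetical helper: report the highest SIMD level found in a
# space-separated list of CPU flags.
highest_simd() {
    case " $1 " in
        *" avx512f "*) echo "avx512" ;;
        *" avx2 "*)    echo "avx2" ;;
        *)             echo "other" ;;
    esac
}

# On a Linux node, classify the first "flags" line of /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
    cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    echo "highest SIMD level on this node: $(highest_simd "$cpu_flags")"
fi
```

A result of avx512 matches the Intel-node flags shown earlier (-xCORE-AVX512, -march=cascadelake), while avx2 matches the AMD-node flags.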
The environment modules that you load define part of your software environment which plays a role in determining the results of your code. Run the "module avail" command to see the available modules. For numerous reasons including scientific reproducibility, when loading an environment module you must specify the full name of the module. This can be done using module load, for example:
$ module load intel/2021.1.2
You will encounter an error if you do not specify the full name of the module:
$ module load anaconda3
ERROR: No default version defined for 'anaconda3'
$ module load anaconda3/2020.11
$ python --version
3.8.5
If you would rather use short aliases instead of full module names then see the environment modules page.
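To catch unversioned module names before a job fails, a script can check that each name includes a version component. This is a minimal sketch: is_full_module_name is a hypothetical helper that only tests for a name/version shape and does not query the module system itself.

```shell
# Hypothetical helper: succeed only if the argument looks like "name/version".
is_full_module_name() {
    case "$1" in
        /*|*/) return 1 ;;   # empty name or empty version
        */*)   return 0 ;;
        *)     return 1 ;;
    esac
}

for m in anaconda3 anaconda3/2020.11 intel/2021.1.2; do
    if is_full_module_name "$m"; then
        echo "full name:       $m"
    else
        echo "missing version: $m"
    fi
done
```

The check is purely syntactic, so a name that passes can still be absent from the list shown by "module avail".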
To use Globus to transfer data to the /scratch/gpfs filesystem of Stellar, which is shared with Traverse, use this endpoint:
Princeton Traverse/Stellar Scratch DTN
There are two dedicated nodes for visualization and data analysis:
$ ssh <YourNetID>@stellar-vis1.princeton.edu # PU
$ ssh <YourNetID>@stellar-vis2.princeton.edu # PPPL
These nodes support TurboVNC and they each offer two NVIDIA V100 GPUs for GPU-enabled software. Please use these nodes for visualization and data analysis instead of the stellar head nodes.