Stellar

Stellar is a heterogeneous cluster composed of Intel and AMD nodes. The cluster was built to support large-scale parallel jobs for researchers in astrophysical sciences, plasma physics, physics, chemical & biological engineering and atmospheric & oceanic sciences.

Schematic diagram of the Stellar cluster, showing the login (head) nodes, compute and GPU nodes, the stellar-bigmem node, and the shared file system.

How to Access the Stellar Cluster

To use Stellar you have to request an account and then log in through SSH.

  1. Requesting Access to Stellar

    PU and CIMES
    Access to the large clusters like Stellar is granted on the basis of brief faculty-sponsored proposals. See section titled For large clusters: Submit a proposal or contribute for details. If, however, you are part of a research group with a faculty member who has contributed to or has an approved project on Stellar, that faculty member can sponsor additional users by sending a request to [email protected]. Any non-Princeton user must be sponsored by a Princeton faculty or staff member for a Research Computer User (RCU) account.

    PPPL
    Please complete this form.
  2. Logging into Stellar

    Once you have been granted access to Stellar, you can connect by opening an SSH client and using the SSH command as detailed below.

    For PU and PPPL, connect to the Intel login node (VPN required from off-campus):

    $ ssh <YourNetID>@stellar.princeton.edu


    For CIMES, connect to the AMD login node (VPN required from off-campus):

    $ ssh <YourNetID>@stellar-amd.princeton.edu


    If you prefer to navigate Stellar through a graphical user interface rather than the Linux command line, Stellar has a web portal option called MyStellar (VPN required from off-campus):

    https://mystellar.princeton.edu

    The web portal enables easy file transfers and interactive jobs with Jupyter. One can also launch a graphical desktop.

Screenshot of the Stellar OnDemand (MyStellar) web portal.

For more on how to SSH, see the Knowledge Base article Secure Shell (SSH): Frequently Asked Questions (FAQ). If you have trouble connecting, see our SSH page.

How to Use the Stellar Cluster

Since Stellar is a Linux system, knowing some basic Linux commands is highly recommended. For an introduction to navigating a Linux system, view the material associated with our Intro to Linux Command Line workshop. 

Using Stellar also requires some knowledge of how to properly use the file systems, environment modules, and the Slurm job scheduler. For an introduction to navigating Princeton's High Performance Computing systems, view our Guide to Princeton's Research Computing Clusters. Additional information specific to Stellar's file systems, job scheduling priorities, and more can be found below.

To work with visualizations, or applications that require graphical user interfaces (GUIs), use Stellar's visualization nodes.

To attend a live session of either workshop, see our Trainings page for the next available workshop.

For more resources, see our Support - How to Get Help page.

Important Guidelines

All users are required to read and abide by the Stellar usage guidelines:

Login Nodes

The login nodes, stellar-intel and stellar-amd, should be used only for interactive work such as compiling programs and submitting jobs as described below. Please remember that these are shared resources for all users. No jobs should be run on the login nodes, with the exception of brief tests that last no more than a few minutes and use only a few CPU-cores. Where practical, we ask that you fill entire nodes so that CPU-core fragmentation is minimized; for this cluster that means requesting cores in multiples of 96.

Use the "snodes" command to see the number of available nodes. Nodes have quad sockets with 24 cores/socket per node and 8GB/core memory. The back end network is 100Gb Infiniband, HDR100.

The /tigress and /projects directories are mounted over NFS on the login nodes as well as the compute nodes. This provides access to data and software built for projects.

Hardware Configuration

Processor                            Nodes  Cores per Node  Memory per Node  Max Instruction Set
2.9 GHz Intel Cascade Lake           296    96              768 GB           AVX-512
2.6 GHz AMD EPYC Rome                187    128             512 GB           AVX2
2.6 GHz AMD EPYC Rome (GPU nodes)    6*     128             550 GB           AVX2
2.75 GHz AMD EPYC Milan (GPU node)   1**    56              1000 GB          AVX2

*There are 2 GPUs per node, each with 40 GB of GPU memory.
**There are 8 A100 SXM GPUs on this node, each with 40 GB of GPU memory.

The Intel nodes are divided into PU-only nodes, PPPL-only nodes, and a shared pool that either side can expand into (think of a Venn diagram with overlapping PU and PPPL circles). The shared nodes are weighted differently so that they are the very last to be assigned.

Each GPU is an NVIDIA A100 with 40 GB of memory. The nodes of Stellar are connected with HDR100 InfiniBand. Run the "shownodes" command for additional information about the nodes. There is one large-memory node (4 TB) available to CIMES users that is not listed in the table above. It may only be used for jobs that require more than 460 GB of memory. Please write to [email protected] for more information. For more technical details about the Stellar cluster, see the full version of the hardware systems table.

Slurm

The default memory allocation on Stellar is 7500 MB per core (--mem-per-cpu=7500M). Relying on the default value works nicely for the Intel nodes, but it can cause problems on the AMD nodes, which offer 512 GB / 128 cores = 4 GB per core. This translates to --mem-per-cpu=4000M, which is not the same as --mem-per-cpu=4G, since 1 GB is 1024 MB in Slurm memory directives. Be sure to explicitly set the memory in Slurm scripts for the AMD nodes. Failure to do so may result in the following error message:

sbatch: error: Batch job submission failed: Requested node configuration is not available

Learn more about Slurm scripts and memory.
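
For example, a batch script intended for the AMD nodes could set the memory explicitly, as in this minimal sketch (the job name and executable are placeholders; depending on your group you may also need an account directive, as described in the next section):

#!/bin/bash
#SBATCH --job-name=amd-job        # placeholder job name
#SBATCH --nodes=1                 # one AMD node
#SBATCH --ntasks-per-node=128     # 128 tasks fills an AMD node
#SBATCH --mem-per-cpu=4000M       # explicit memory per core for the AMD nodes
#SBATCH --time=01:00:00           # run time limit (HH:MM:SS)

srun ./mycode                     # placeholder executable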

Project Accounting

Scheduling for PPPL and CIMES is done based on project. Users in these groups should add the following Slurm directive to all scripts:

#SBATCH -A <account-name>

Run the "sshare" command to see the different project names.

Job Scheduling (QOS Parameters)

All jobs must be run through the Slurm scheduler. If a job exceeds any of the limits below, it will be held until it is eligible to run.

CPU Jobs

QOS            Time Limit  Jobs per User  Cores per User  Cores Available  Max Jobs Submit
stellar-debug  30 minutes  2              4416            10000            10
cimes-short    24 hours    10             23040           N/A              N/A
cimes-medium   48 hours    2              2048            4096             N/A
cimes-long     7 days      2              2048            4096             N/A
pppl-short     24 hours    20             4096            N/A              N/A
pppl-medium    48 hours    4              2048            4096             N/A
pppl-long      7 days      3              2048            4096             N/A
pu-short       24 hours    12             5000            N/A              12
pu-medium      48 hours    4              3000            7680             5
pu-long        7 days      3              2500            4416             N/A

Use the "qos" command to see the latest values for the table above.

GPU Jobs

QOS  Time Limit  Jobs per User  GPUs per User  Max Nodes per User  Max Jobs Submit
gpu  7 days      2              8              2                   N/A

Use the "qos" command to see the latest values for the table above.

Serial Partition

Any job submitted to the PU or PPPL partition requesting 47 or fewer CPU-cores will be assigned to the serial queue. Jobs in this queue have the lowest priority of all jobs since the cluster is intended for multinode jobs. If you need to run a large number of serial jobs (47 cores or fewer) then you should consider moving that work to another cluster such as Della.

Compiler Flags and Math Libraries

Intel Nodes

The Intel nodes feature Cascade Lake processors with AVX-512 as the highest instruction set. As a starting point, consider using these optimization flags when compiling a C++ code, for instance:

$ ssh <YourNetID>@stellar-intel.princeton.edu
$ module load intel/2021.1.2
$ icpc -Ofast -xCORE-AVX512 -o mycode mycode.cpp

The Intel Math Kernel Library (MKL) is automatically loaded as a module when an Intel compiler module is loaded.

For GCC:

$ module load gcc-toolset/10
$ g++ -Ofast -march=cascadelake -o mycode mycode.cpp
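
If your code uses OpenMP, add the corresponding compiler flag; for example (a sketch using the same placeholder file names as above):

$ icpc -Ofast -xCORE-AVX512 -qopenmp -o mycode mycode.cpp
$ g++ -Ofast -march=cascadelake -fopenmp -o mycode mycode.cpp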

AMD Nodes

The AMD nodes feature the EPYC processor with AVX2 as the highest instruction set. See the Quick Reference Guide by AMD for compiler flags for different compilers (AOCC, GCC, Intel) and the AOCC user guide. As a starting point, consider using these optimization flags when compiling a C++ code, for instance:

$ ssh <YourNetID>@stellar-amd.princeton.edu
$ module load aocc/3.0.0 aocl/aocc/3.0_6
$ clang++ -Ofast -march=native -o mycode mycode.cpp

For a parallel Fortran code:

$ ssh <YourNetID>@stellar-amd.princeton.edu
$ module load aocc/3.0.0 aocl/aocc/3.0_6 openmpi/aocc-3.0.0/4.1.0
$ mpif90 -Ofast -march=native -o hw hello_world_mpi.f90

Load the aocl module to make AMD's BLIS and libFLAME linear algebra libraries available, as well as FFTW3 and ScaLAPACK. Excellent performance was found for the High-Performance LINPACK benchmark using GCC together with these libraries.
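
For instance, assuming the aocl module adds the library directories to your link path, a C++ code that calls BLAS and LAPACK routines might be linked against BLIS and libFLAME roughly as follows (a sketch, not a verified recipe):

$ module load aocc/3.0.0 aocl/aocc/3.0_6
$ clang++ -Ofast -march=native -o mycode mycode.cpp -lflame -lblis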

If you wish to use the Intel compiler for the AMD nodes then consider these flags:

$ module load intel/2021.1.2
$ icpc -Ofast -march=core-avx2 -o mycode mycode.cpp

Use the -march option above if you encounter the following error message:

Please verify that both the operating system and the processor support Intel(R) X87, CMOV, MMX,
FXSAVE, SSE, SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, POPCNT, AVX and F16C instructions.

Environment Modules

Loading Modules

The environment modules that you load define part of your software environment, which plays a role in determining the results of your code. Run the "module avail" command to see the available modules. For numerous reasons, including scientific reproducibility, you must specify the full name of a module when loading it. This can be done using module load, for example:

$ module load intel/19.1.1.217

You will encounter an error if you do not specify the full name of the module:

$ module load anaconda3
ERROR: No default version defined for 'anaconda3'
$ module load anaconda3/2020.11
$ python --version
Python 3.8.5

Notable Modules and Modules to Avoid

  • aocc/<version> makes the AMD compilers available
  • aocl/<compiler>/<version> makes the AMD math libraries available
  • cmake/3.18.2 provides a newer CMake over the system version (3.11.4)
  • gcc/4.8.5 provides an older GNU Compiler Collection (GCC); it should only be used in rare cases
  • gcc/8.3.1 is equivalent to using the system GCC
  • gcc-toolset/10 makes GCC 10.2.1 available (use this when the system GCC is insufficient)
  • nvhpc/21.1 provides the NVIDIA compilers and libraries (the compilers replace PGI)
  • rh/devtoolset/7 makes GCC 7.3.1 available; it should be avoided in favor of the system GCC

Module Aliases

If you would rather use short aliases instead of full module names then see the environment modules page.

Software

The software environment on Stellar is very similar to the other Research Computing clusters. See the general documentation for Princeton University Research Computing. If you find that you need software packages that are not installed on Stellar then please send a request via e-mail to [email protected].

Anaconda Python

The Anaconda Python distribution should be used when working with Python on Stellar:

$ module avail anaconda3
$ module load anaconda3/2021.11
$ python --version

See our Python page for more information on using the Anaconda Python distribution on the Research Computing clusters. One may also consider installing Miniconda. We do not provide a Python 2 anaconda module on Stellar since that version of the language is no longer supported.
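
For example, a custom Conda environment can be created and used like this (the environment name and packages are only illustrative):

$ module load anaconda3/2021.11
$ conda create --name my-env numpy scipy
$ conda activate my-env
$ python --version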

System Python

The system Python is available but it should be avoided in favor of the Anaconda Python distribution which provides optimizations for our hardware. The system Python exists largely for the system administrators to install software. These commands illustrate its use:

$ python
-bash: python: command not found
$ python3
Python 3.6.8 (default, Nov 15 2020, 11:45:35) 
[GCC 8.3.1 20191121 (Red Hat 8.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
$ python2
Python 2.7.17 (default, Nov 16 2020, 23:55:19) 
[GCC 8.3.1 20191121 (Red Hat 8.3.1-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

Again, for scientific work one should use the Anaconda Python distribution and not the system Python.

MATLAB

Stellar is intended for large multinode jobs so MATLAB is not available on the compute nodes. It is available on the visualization nodes, however. If you use MATLAB on the visualization nodes then please restrict the number of CPU-cores that you use. MATLAB can be used on other clusters such as Della.

Globus

To use Globus to transfer data to the /scratch/gpfs filesystem of Stellar, which is shared with Traverse, use this endpoint:

Princeton Traverse/Stellar Scratch DTN

Visualization Nodes

The Stellar cluster has two dedicated nodes for visualization and post-processing tasks, called stellar-vis1 (for Princeton University researchers) and stellar-vis2 (for Princeton Plasma Physics Lab researchers).

Hardware Details

The stellar-vis1 and stellar-vis2 nodes each feature 40 CPU-cores, 790 GB of memory, and two NVIDIA V100 GPUs with 32 GB of memory per GPU.

Both nodes have internet access.

How to Use the Visualization Node

Users can connect via SSH with the following commands (VPN required if connecting from off-campus)

$ ssh <YourNetID>@stellar-vis1.princeton.edu  # for PU users
$ ssh <YourNetID>@stellar-vis2.princeton.edu  # for PPPL users

but to work with graphical applications on the visualization node, see our guide to working with visualizations and graphical user-interface (GUI) applications.

Note that there is no job scheduler on stellar-vis1 or stellar-vis2, so please be considerate of other users when using this resource. To ensure that the system remains a shared resource, there are limits in place preventing one individual from using all of the resources. You can check your activity with the command "htop -u $USER".

In addition to visualization, the nodes can be used for tasks that are incompatible with the Slurm job scheduler, or for work that is not appropriate for the Stellar login nodes (such as downloading large amounts of data from the internet).

Big Memory Node

Stellar features a node with AMD CPUs and 4 TB of memory which can be used for a variety of workloads. To use this node, one must specify the partition and a run time limit of 1 hour or more.

For batch jobs:

#SBATCH --mem=1500G
#SBATCH --time=01:00:00
#SBATCH --partition=bigmem

For interactive jobs:

$ salloc --nodes=1 --ntasks=1 --time=1:00:00 --mem=3000G --partition=bigmem

For Jupyter OnDemand (see below), add the following to the "Extra slurm options" field when making the session:

--partition=bigmem
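
Putting these pieces together, a complete batch script for the big-memory node might look like the following sketch (the job name, memory request, and executable are placeholders):

#!/bin/bash
#SBATCH --job-name=bigmem-job     # placeholder job name
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1500G               # this node is for jobs needing more than 460 GB
#SBATCH --time=01:00:00           # at least 1 hour
#SBATCH --partition=bigmem

srun ./mycode                     # placeholder executable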

For more on running jobs, see our Slurm webpage.

Jupyter OnDemand

There is a web portal for running Jupyter notebooks at https://mystellar.princeton.edu (VPN is required to connect from off-campus). Follow the directions on our Jupyter page for working with custom Conda environments. Choose "Interactive Apps" then either "Jupyter" or "Jupyter on Vis node". The first choice is for intensive sessions, while the latter runs on the appropriate visualization node and is for light work that requires internet access.

Screenshot of launching Jupyter through MyStellar.

To request a GPU, enter the following in the field for "Extra slurm options":

--gres=gpu:1

 

Maintenance Window

Stellar will be down for routine maintenance on the second Tuesday of every month from approximately 6 AM to 2 PM. This includes the associated filesystems of /scratch/gpfs, /projects and /tigress. Please mark your calendar. Jobs submitted close to downtime will remain in the queue unless they can be scheduled to finish before downtime (see more). Users will receive an email when the cluster is returned to service.