Della is a four-rack Intel computer cluster, originally acquired through a joint effort of Astrophysics, the Lewis-Sigler Institute for Integrative Genomics, PICSciE, and OIT. It is intended as a platform for running both parallel and serial production jobs. The system has grown over time and now includes groups of nodes using different generations of Intel processor technology; for hardware details, see the Hardware Configuration section below. All nodes are connected via an FDR Infiniband high-bandwidth, low-latency network.


System Configuration and Usage

General Guidelines

The head node, della5, should be used for interactive work only, such as compiling programs and submitting jobs as described below. No jobs should be run on the head node other than brief tests lasting no more than a few minutes. Where practical, please fill nodes entirely so that CPU core fragmentation is minimized.
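One way to fill a node entirely is to request all of its cores (or the whole node) in your Slurm batch script. A minimal sketch follows; the job name, executable, and core count are hypothetical, and the core count should match the node type you are targeting:

```shell
#!/bin/bash
# Minimal Slurm batch sketch: request one full node so no stray
# cores are left behind. Job name and executable are placeholders.
#SBATCH --job-name=myjob
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20   # set to the core count of the target node type
#SBATCH --time=00:30:00

srun ./my_program              # hypothetical executable
```

Submit the script from the head node with "sbatch myjob.slurm"; the computation itself then runs on a compute node, not on della5.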

There is also a web portal that provides access to the cluster through a web browser. It enables easy file transfers and interactive jobs in RStudio, Jupyter, Stata, and MATLAB. A VPN is required to access the web portal from off-campus.

Maintenance Window

Della will be down for maintenance on the second Tuesday of each month from 6 to 10 AM.

How to Word Acknowledgement of Support and/or Use of Research Computing Resources in Publications

"The author(s) are pleased to acknowledge that the work reported on in this paper was substantially performed using the Princeton Research Computing resources at Princeton University, which is a consortium of groups led by the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology's Research Computing."

"The simulations presented in this article were performed on computational resources managed and supported by Princeton Research Computing, a consortium of groups including the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology's High Performance Computing Center and Visualization Laboratory at Princeton University."

Hardware Configuration

Dell Linux Cluster

  Processor Speed        Memory per Node
  2.5 GHz Ivybridge      128 GB
  2.6 GHz Haswell        128 GB
  2.4 GHz Skylake        128 GB
  2.4 GHz Broadwell      192 GB
  2.8 GHz Cascade Lake   190 GB

  Total cores: 6400

All nodes are connected with FDR Infiniband.

Job Scheduling (QOS parameters)

The values in the table below may not be accurate since changes are made regularly to maximize job throughput. Use the "qos" command to see the currently active values.

All jobs on Della must be run through the Slurm scheduler. If a job would exceed any of the limits below, it will be held until it is eligible to run. Jobs should not specify the QOS in which they should run; this allows the Slurm scheduler to distribute jobs accordingly.

Jobs will be assigned a quality of service (QOS) according to the length of time specified for the job:

QOS      Time Limit            Jobs per User   Cores per User   Cores Available
test     61 minutes            2 jobs          [30 nodes]       no limit
short    24 hours              350 jobs        400 cores        no limit
medium   72 hours              200 jobs        300 cores        1600 cores
vlong    144 hours (6 days)    50 jobs         200 cores        1300 cores

Jobs are further prioritized by the Slurm scheduler based on a number of factors: job size, run times, node availability, wait times, and percentage of usage over a 30-day period (fairshare). Also, these values reflect the minimum limits in effect, and the actual values may be higher. Please use the "qos" command to see the limits in effect at the current time.
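Because the QOS is assigned from the requested wall time, a batch script only needs a --time directive; a sketch follows (the task count and executable are placeholders, and the QOS mapping reflects the table above, which may change):

```shell
#!/bin/bash
# Sketch: specify only the wall time and let Slurm assign the QOS.
# A 12-hour request would fall under the "short" QOS in the current table.
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --time=12:00:00   # no --qos line; the scheduler chooses one

srun ./my_program          # hypothetical executable
```

After submission, "squeue -u $USER" shows the job's state while it waits for and then uses its assigned QOS.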

Recommended File System Usage (/home, /scratch, /tigress)

/home (shared via NFS to all the compute nodes) is intended for scripts, source code, executables and small static data sets that may be needed as standard input/configuration for codes.

/scratch/network (shared via NFS to all the compute nodes) is intended for dynamic data that doesn't require high-bandwidth I/O, such as storing final output from a compute job. You may create a directory /scratch/network/myusername and use it for your temporary files. Files are NOT backed up, so this data should be moved to persistent storage once it is no longer needed for continued computation. Any files left here will be removed after 60 days.
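The setup and cleanup workflow can be sketched as follows; the "results" subdirectory and the /tigress destination path are hypothetical and assume you have requested a /tigress directory:

```shell
# One-time setup of a personal directory on /scratch/network
# ($USER expands to your login name).
mkdir -p /scratch/network/$USER

# When a job finishes, copy anything worth keeping to persistent
# storage before the 60-day purge removes it (paths are illustrative):
cp -r /scratch/network/$USER/results /tigress/$USER/
```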

/scratch/gpfs (shared via GPFS to all the compute nodes, 800 TB) is intended for dynamic data that requires higher-bandwidth I/O. Files are NOT backed up, so this data should be moved to persistent storage as soon as it is no longer needed for computations.

/tigress (shared via GPFS to all TIGRESS resources, 6 PB) is intended for more persistent storage and should provide high-bandwidth I/O (20 GB/s aggregate bandwidth for jobs across 16 or more nodes). Users are given a default quota of 512 GB when they request a directory in this storage, and that default can be increased on request. We do ask people to consider what they really need and to regularly clean out data that is no longer needed, since this filesystem is shared by the users of all our systems.

/scratch (local to each compute node) is intended for data local to each task of a job, and it should be cleaned out at the end of each job. Nodes have about 130 GB to 1400 GB available, depending on the node.
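A batch script can stage data on node-local /scratch and guarantee cleanup with a trap, so the files are removed even if the job fails partway. This is a sketch; the input/output file names and executable are hypothetical:

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=01:00:00

# Stage per-task data on node-local /scratch; clean up on exit.
WORKDIR=/scratch/$USER.$SLURM_JOB_ID
mkdir -p "$WORKDIR"
trap 'rm -rf "$WORKDIR"' EXIT      # remove local files even on failure

cp input.dat "$WORKDIR"/            # hypothetical input file
cd "$WORKDIR"
srun ./my_program input.dat         # hypothetical executable
cp output.dat "$SLURM_SUBMIT_DIR"/  # copy results back before cleanup
```

Using a job-ID-based directory name keeps concurrent jobs on the same node from colliding.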

Running Third-party Software

If you are running third-party software whose characteristics (e.g., memory usage) you are unfamiliar with, please check your job after 5-15 minutes using 'top' or 'ps -ef' on the compute nodes being used. If memory usage is growing rapidly, or is close to exceeding the per-processor memory limit, terminate your job before it causes the system to hang or crash. You can determine which node(s) your job is running on using the "scontrol show job <jobnumber>" command.