The Tiger cluster has two parts: tigercpu, an HPE Apollo cluster comprising 408 Intel Skylake CPU nodes, and tigergpu, a Dell compute cluster comprising 320 NVIDIA P100 GPUs across 80 Broadwell nodes.

On tigercpu, each CPU core has at least 4.8 GB of memory. The 40-core nodes are interconnected by an Omnipath fabric with oversubscription: the 24 nodes within each chassis are connected at full bandwidth, while traffic between chassis is oversubscribed.

On tigergpu, each GPU has 16 GB of memory. The nodes are interconnected by an Intel Omnipath fabric, and each GPU is on a dedicated x16 PCIe bus. Every node has 2.9 TB of NVMe-connected scratch storage and 256 GB of RAM. The CPUs are Intel Broadwell E5-2680 v4, with 28 cores per node. A dashboard of GPU utilization is also available.

System Configuration and Usage

General Guidelines

The head nodes, tigercpu and tigergpu, should be used for interactive work only, such as compiling programs and submitting jobs as described below. No jobs should be run on the head nodes other than brief tests lasting no more than a few minutes. Where practical, we ask that you fill nodes entirely so that CPU core fragmentation is minimized.
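As a sketch of filling a node completely, a Slurm batch script for tigercpu might request all 40 cores of one node; the job name and program below are placeholders, not part of this guide:

```shell
#!/bin/bash
# Sketch of a full-node request on tigercpu (assumes Slurm; executable is a placeholder)
#SBATCH --job-name=fullnode       # arbitrary job name
#SBATCH --nodes=1                 # one whole node
#SBATCH --ntasks-per-node=40      # all 40 cores, avoiding core fragmentation
#SBATCH --time=00:30:00           # requested wall-clock time

srun ./my_program                 # placeholder executable
```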

Please remember that these are shared resources for all users.

Running Jobs

Jobs can be submitted for either portion of the Tiger system from either head node, but it is best to compile programs on the head node associated with the portion of the system where the program will run.  That is, compile GPU jobs on tigergpu and non-GPU jobs on tigercpu.  Running a job on the GPU nodes requires additional specifications in the job script.  Refer to the Tiger Tutorial for instructions and examples.
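The Tiger Tutorial is the authoritative source for those examples; as a rough sketch, the extra specification for a GPU job is typically a Slurm GRES request. Everything named below (module name, executable) is an assumption to be checked against the tutorial:

```shell
#!/bin/bash
# Sketch of a tigergpu job script (assumes Slurm with GRES-configured GPUs)
#SBATCH --job-name=gpu-test
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --gres=gpu:1              # request one of the node's four P100s
#SBATCH --time=01:00:00

module load cudatoolkit           # module name is an assumption; check `module avail`
srun ./my_gpu_program             # placeholder executable
```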

Maintenance Window

Tiger will be down for maintenance the second Tuesday of the month from 6-10 AM.


Schematic diagram of the Tiger cluster, and schematics of a CPU node and a GPU node

How to Word an Acknowledgement of Support and/or Use of Research Computing Resources in Publications

"The author(s) are pleased to acknowledge that the work reported on in this paper was substantially performed using the Princeton Research Computing resources at Princeton University, which is a consortium of groups led by the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology's Research Computing."

"The simulations presented in this article were performed on computational resources managed and supported by Princeton Research Computing, a consortium of groups including the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology's High Performance Computing Center and Visualization Laboratory at Princeton University."

Hardware Configuration

System                                    Nodes  Cores per Node  Memory per Node  Total Cores  Interconnect  Performance
Dell Linux Cluster
(2.4 GHz Xeon Broadwell E5-2680 v4)       80     28              256 GB           2240         Omnipath      1500 TFLOPS
  GPUs: 1328 MHz P100, 4 per node,
  16 GB per GPU                           -      -               -                320 GPUs     Omnipath      1504 TFLOPS
HPE Linux Cluster (2.4 GHz Skylake)       408    40              192 GB           16320        Omnipath      xx TFLOPS

Distribution of CPU and memory

There are 16,320 processors available, 40 per node. Each node contains at least 192 GB of memory (4.8 GB per core). The nodes are assembled into 24-node chassis; within each chassis the Omnipath connection is 1:1, while between chassis it is oversubscribed 2:1.
There are also 40 nodes with 768 GB of memory (19 GB per core). These large-memory nodes also have SSD drives for faster local I/O.
The nodes are all connected through Omnipath switches for MPI traffic and for GPFS and NFS I/O, and over Gigabit Ethernet for other communication.


Job Scheduling (QOS parameters)

The values in the table below may not be accurate since changes are made regularly to maximize job throughput. Use the "qos" command to see the currently active values.
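If the site-specific "qos" wrapper is unavailable, the underlying limits can usually be read with Slurm's sacctmgr. The field names below are standard Slurm, but which limits Tiger actually configures is an assumption; output will vary:

```shell
# List wall-time and per-user limits for each QOS (assumes Slurm accounting is enabled)
sacctmgr show qos format=Name%10,MaxWall,MaxJobsPU,MaxTRESPU%20
```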

Jobs are prioritized by the Slurm scheduler based on a number of factors: job size, run time, node availability, wait time, and percentage of usage over a 30-day period, as well as a fairshare mechanism that provides access for large contributors. The policy below may change as the job mix on the machine changes.

The scheduler assigns jobs to the test, vshort, short, medium, or long quality of service (QOS), differentiated by the wall-clock time requested as follows:

QOS     Time Limit          Jobs per User  Cores per Job  Cores Available
test    1 hour              2              no limit       15560
vshort  6 hours             64             no limit       no limit
short   24 hours            32             4000           13360
medium  72 hours            16             2000           7840
long    144 hours (6 days)  8              1000           5520
In most cases, these are the maximum numbers and limits may be changed if demand requires. Use the "qos" command to view the actual values in effect.

Recommended File System Usage (/home, /scratch, /tigress)

/home (shared via NFS to all the compute nodes) is intended for scripts, source code, executables and small static data sets that may be needed as standard input/configuration for codes.

/scratch/gpfs is intended for dynamic data that requires higher-bandwidth I/O. Files are NOT backed up, so this data should be moved to persistent storage as soon as it is no longer needed for computations. Please remove files on /scratch/gpfs that you no longer need.
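As a sketch of that cleanup, finished results could be copied from /scratch/gpfs to persistent storage and then removed; the <YourNetID> and run-directory paths below are placeholders, not a documented layout:

```shell
# Move finished results from unbacked-up scratch to persistent /tigress storage.
# <YourNetID> and run01 are placeholders for your own directories.
rsync -av /scratch/gpfs/<YourNetID>/run01/ /tigress/<YourNetID>/run01/
rm -r /scratch/gpfs/<YourNetID>/run01      # only after verifying the copy succeeded
```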
/tigress (shared using GPFS) is intended for more persistent storage and should provide high-bandwidth I/O (8 GB/s aggregate bandwidth for jobs across 16 or more nodes). Users are given a default quota of 512 GB when they request a directory in this storage, and that default can be increased on request. We ask users to consider what they really need and to regularly clean out data that is no longer needed, since this filesystem is shared by the users of all our systems. See the /tigress Usage Guidelines for more information.
/scratch (local to each compute node; 1.8 TB available on each node) is intended for data local to each task of a job, and it should be cleaned out at the end of each job. This is the fastest storage to access. Note that these scratch directories are purged nightly of files older than 30 days. You can also use /tmp instead of /scratch, which draws on the same space; the advantage is that /tmp is removed automatically when your job completes, whereas /scratch must be cleaned manually.
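Putting this together, a job can stage its working files through node-local /scratch and clean up on exit. The per-job directory name, input/output files, and executable below are all assumptions for illustration:

```shell
#!/bin/bash
# Sketch: stage data through node-local /scratch and clean up afterwards (assumes Slurm)
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

WORKDIR=/scratch/$USER.$SLURM_JOB_ID    # per-job directory name is an assumption
mkdir -p "$WORKDIR"
trap 'rm -rf "$WORKDIR"' EXIT           # remove local scratch even if the job fails

cp input.dat "$WORKDIR"/                # placeholder input file
cd "$WORKDIR"
srun ./my_program input.dat             # placeholder executable
cp output.dat "$SLURM_SUBMIT_DIR"/      # copy results back before cleanup
```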