Tiger


The Tiger cluster has two parts: tigercpu, an HPE Apollo cluster composed of 392 Intel Skylake CPU nodes, and tigergpu, a Dell cluster composed of 320 NVIDIA P100 GPUs spread across 80 Broadwell nodes.

On tigercpu, each CPU core has at least 4.8 GB of memory. The 40-core nodes are interconnected by an Omnipath fabric with oversubscription; the 24 nodes within each chassis are connected at full bandwidth.

On tigergpu, each GPU has 16 GB of memory. The nodes are interconnected by an Intel Omnipath fabric, and each GPU sits on a dedicated x16 PCIe bus. Every node has 2.9 TB of NVMe-connected scratch as well as 256 GB of RAM. The CPUs are Intel Broadwell E5-2680 v4, with 28 cores per node.

System Configuration and Usage

General Guidelines

The head nodes, tigercpu and tigergpu, should be used only for interactive work such as compiling programs and submitting jobs, as described below. No jobs should be run on the head nodes other than brief tests lasting no more than a few minutes. Where practical, we ask that you fill nodes completely so that CPU core fragmentation is minimized.
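For interactive work that goes beyond a brief test, a compute node can be requested through the scheduler rather than using a head node. The sketch below uses standard Slurm options; the core count matches a 40-core tigercpu node, but the time value is only an example.

    # Request an interactive session on one full tigercpu node (example values)
    salloc --nodes=1 --ntasks=40 --time=01:00:00

    # Asking for all 40 cores keeps the node from being left partially
    # occupied, which is what causes CPU core fragmentation.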

Please remember that these are shared resources for all users.

Running Jobs

Jobs can be submitted to either portion of the Tiger system from either head node, but it is best to compile programs on the head node associated with the portion of the system where the program will run: compile GPU codes on tigergpu and non-GPU codes on tigercpu. Running a job on the GPU nodes requires additional specifications in the job script; refer to the Tiger Tutorial for instructions and examples.
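As a rough illustration, a GPU job script might look like the following sketch. The job name, module line, and executable are placeholders, and the --gres line is the generic Slurm form for requesting GPUs; see the Tiger Tutorial for the exact specifications required on this system.

    #!/bin/bash
    #SBATCH --job-name=gpu-example     # placeholder name
    #SBATCH --nodes=1                  # one tigergpu node
    #SBATCH --ntasks=1
    #SBATCH --time=01:00:00            # example wallclock limit
    #SBATCH --gres=gpu:1               # request one of the node's four P100s

    # Load your compiler/CUDA environment here (module names vary by site)
    # module load cuda

    srun ./my_gpu_program              # placeholder executable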

Maintenance Window

Tiger will be down for maintenance the second Tuesday of the month from 6-10 AM.

Hardware Configuration

System | Processor | Speed | Nodes | Cores per Node | Memory per Node | Total Cores | Interconnect | Performance (Theoretical)
TigerGPU (Dell Linux Cluster), CPUs | Xeon Broadwell E5-2680 v4 | 2.4 GHz | 80 | 28 | 720 GB | 2240 | Omnipath | 1500 TFLOPS
TigerGPU, GPUs | NVIDIA P100 | 1328 MHz | 80 | 4 GPUs | 16 GB per GPU | 320 GPUs | Omnipath | 1504 TFLOPS
TigerCPU (HPE Linux Cluster) | Xeon Skylake | 2.4 GHz | 392 | 40 | 192 GB | 15680 | Omnipath | xx TFLOPS

Distribution of CPU and memory

There are 15,680 processor cores available, 40 per node. Each node contains at least 192 GB of memory (4.8 GB per core). The nodes are assembled into 24-node chassis, each with a 1:1 Omnipath connection within the chassis; connections between chassis are oversubscribed 2:1.
 
There are also 40 nodes with 768 GB of memory (19.2 GB per core). These larger-memory nodes also have SSD drives for faster local I/O.
All nodes are connected through Omnipath switches for MPI traffic, GPFS, and NFS I/O, and over Gigabit Ethernet for other communication.
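Because a standard node offers roughly 4.8 GB per core, a job that needs more memory per task can either request additional cores or state its memory requirement explicitly so the scheduler places it appropriately. The directives below are standard Slurm options with example values only; whether the 768 GB nodes are reached through a memory request alone or through a separate partition or constraint is not covered here.

    # Standard node: stay within ~4.8 GB per core (example value)
    #SBATCH --mem-per-cpu=4G

    # Large-memory node: request most of a 768 GB node instead
    # (use either --mem or --mem-per-cpu in a given script, not both)
    #SBATCH --mem=750G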

 

Job Scheduling (QOS parameters)

All jobs must be run through the scheduler on Tiger.

Jobs are prioritized through the Slurm scheduler based on a number of factors: job size, run time, node availability, wait time, and percentage of usage over a 30-day period, as well as a fairshare mechanism that provides access for large contributors. The policy below may change as the job mix on the machine changes.

Jobs are assigned to the test, vshort, short, medium, or vlong quality of service (QOS) by the scheduler. The levels are differentiated by the requested wallclock time as follows:

QOS | Time Limit | Jobs per User | Cores per Job | Cores Available
test | 1 hour | 2 jobs | 2048 cores | 9472 cores
vshort | 6 hours | 64 jobs | 2048 cores | 8512 cores
short | 24 hours | 64 jobs | 2048 cores | 8512 cores
medium | 72 hours | 64 jobs | 512 cores | 7104 cores
vlong | 144 hours (6 days) | 64 jobs | 256 cores | 4736 cores
 
In most cases these are maximum values, and additional limits may be imposed if demand requires. Use the "qos" command to view the values currently in effect.
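Because the QOS is selected from the requested wallclock time, the --time value in a job script effectively chooses the queue. A few examples based on the limits in the table above (pick one per job):

    #SBATCH --time=00:59:00     # under 1 hour   -> test
    #SBATCH --time=23:00:00     # under 24 hours -> short
    #SBATCH --time=5-00:00:00   # 5 days         -> vlong (256-core limit per job)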

Recommended File System Usage (/home, /scratch, /tigress)

/home (shared via NFS to all the compute nodes) is intended for scripts, source code, executables and small static data sets that may be needed as standard input/configuration for codes.

/scratch/gpfs is intended for dynamic data that requires higher-bandwidth I/O. Files are NOT backed up, so this data should be moved to persistent storage as soon as it is no longer needed for computations. This file system is purged nightly of files older than 180 days.
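One way to move finished results off /scratch/gpfs is a copy to /tigress followed by removal of the originals. The paths below are placeholders; substitute your own directories.

    # Copy completed results to persistent storage, then remove the scratch copy.
    # <netid> and finished_run are placeholders for your own directory names.
    rsync -av /scratch/gpfs/<netid>/finished_run/ /tigress/<netid>/finished_run/ \
        && rm -rf /scratch/gpfs/<netid>/finished_run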
 
/tigress (shared using GPFS) is intended for more persistent storage and provides high-bandwidth I/O (8 GB/s aggregate bandwidth for jobs spanning 16 or more nodes). Users are given a default quota of 512 GB when they request a directory in this storage, and that default can be increased on request. Because this file system is shared by the users of all our systems, please consider what you really need and regularly clean out data that is no longer needed. See the /tigress Usage Guidelines for more information.
 
/scratch (local to each compute node, with 1.8 TB available per node) is intended for data local to each task of a job, and it should be cleaned out at the end of each job. This is the fastest storage for access. Note that these scratch directories are purged nightly of files older than 30 days. You can also use /tmp instead of /scratch; it uses the same space, with the advantage that /tmp is removed automatically when your job completes, whereas /scratch must be cleaned manually.
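A common pattern is to stage a job's working files onto the node-local /scratch, run there, and copy the results back to shared storage before the job ends. A minimal sketch, assuming a per-user directory layout and placeholder file names:

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --time=01:00:00            # example wallclock limit

    # Per-job directory on the node-local disk (assumed layout)
    WORKDIR=/scratch/$USER/$SLURM_JOB_ID
    mkdir -p "$WORKDIR"

    # Stage input, run on local scratch, then copy results back
    cp /tigress/$USER/input.dat "$WORKDIR"/        # placeholder input
    cd "$WORKDIR"
    srun ./my_program input.dat                    # placeholder executable

    cp results.out /tigress/$USER/                 # placeholder output
    rm -rf "$WORKDIR"                              # clean local scratch at job end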