The Tiger cluster has two parts: tigercpu, an HPE Apollo cluster of 408 Intel Skylake CPU nodes, and tigergpu, a Dell cluster of 80 Intel Broadwell nodes with a total of 320 NVIDIA P100 GPUs.
On tigercpu, each CPU core has at least 4.8 GB of memory. The 40-core nodes are interconnected by an Omni-Path fabric with oversubscription. Each chassis holds 24 nodes, all connected to one another at full bandwidth.
On tigergpu, each GPU has 16 GB of memory. The nodes are interconnected by an Intel Omni-Path fabric. Each GPU is on a dedicated x16 PCIe bus. Each node has 2.9 TB of NVMe-connected scratch and 256 GB of RAM. The CPUs are Intel Broadwell E5-2680 v4, with 28 cores per node. View a dashboard of GPU utilization.
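The per-node and cluster-wide totals implied by the figures above can be worked out with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the derived values follow purely from the numbers quoted in this description rather than from any separate hardware inventory.

```python
# Figures copied from the hardware description above.
TIGERCPU_NODES = 408
TIGERCPU_CORES_PER_NODE = 40
TIGERCPU_MIN_GB_PER_CORE = 4.8

TIGERGPU_NODES = 80
TIGERGPU_GPUS = 320
TIGERGPU_CORES_PER_NODE = 28
TIGERGPU_RAM_GB_PER_NODE = 256


def summarize():
    """Return totals derived from the quoted figures (hypothetical helper)."""
    return {
        # 408 nodes x 40 cores
        "tigercpu_total_cores": TIGERCPU_NODES * TIGERCPU_CORES_PER_NODE,
        # at least 4.8 GB per core x 40 cores
        "tigercpu_min_gb_per_node": round(
            TIGERCPU_MIN_GB_PER_CORE * TIGERCPU_CORES_PER_NODE, 1
        ),
        # 320 GPUs spread over 80 nodes
        "tigergpu_gpus_per_node": TIGERGPU_GPUS // TIGERGPU_NODES,
        # 80 nodes x 28 cores
        "tigergpu_total_cores": TIGERGPU_NODES * TIGERGPU_CORES_PER_NODE,
        # 256 GB of RAM shared by 28 cores
        "tigergpu_gb_per_core": round(
            TIGERGPU_RAM_GB_PER_NODE / TIGERGPU_CORES_PER_NODE, 2
        ),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Under these figures, a tigercpu node has at least 192 GB of memory and a tigergpu node hosts 4 GPUs.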
System Configuration and Usage
The head nodes, tigercpu and tigergpu, should be used for interactive work only, such as compiling programs and submitting jobs as described below. No jobs should be run on the head nodes other than brief tests that last no more than a few minutes. Where practical, we ask that you fill nodes entirely so that CPU core fragmentation is minimized.
Please remember that these are shared resources for all users.
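One way to follow the request to fill nodes entirely is to round a job's task count up to a whole number of nodes before submitting. The following is a minimal sketch, assuming the 40-core tigercpu nodes described earlier; the function name `whole_node_request` is hypothetical and not part of any cluster tooling.

```python
CORES_PER_NODE = 40  # tigercpu node size, from the hardware description


def whole_node_request(tasks_wanted, cores_per_node=CORES_PER_NODE):
    """Round a task count up to whole nodes.

    Returns a tuple (nodes, total_cores, spare_cores), where spare_cores
    is the number of cores on the last node that the original task count
    would have left unused.
    """
    if tasks_wanted < 1:
        raise ValueError("tasks_wanted must be a positive integer")
    # Integer ceiling division avoids floating-point rounding.
    nodes = (tasks_wanted + cores_per_node - 1) // cores_per_node
    total_cores = nodes * cores_per_node
    return nodes, total_cores, total_cores - tasks_wanted


if __name__ == "__main__":
    print(whole_node_request(100))  # 100 tasks need 3 nodes, leaving 20 spare cores
    print(whole_node_request(80))   # 80 tasks fill exactly 2 nodes
```

If the spare-core count is nonzero, consider either scaling the job up to use those cores or scaling it down to the next lower whole-node count.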
Jobs can be submitted for either portion of the Tiger system from either head node, but it is best to compile programs on the head node associated with the portion of the system where the program will run. That is, compile GPU jobs on tigergpu and non-GPU jobs on tigercpu. Running a job on the GPU nodes requires additional specifications in the job script. Refer to the Tiger Tutorial for instructions and examples.
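To illustrate what "additional specifications" for a GPU job might look like, here is a rough sketch that renders a job script as text. It assumes the scheduler is Slurm, which this page does not state, and uses the standard `sbatch` directives `--nodes`, `--ntasks-per-node`, `--time`, and `--gres=gpu:N`. The function name, the job name, and the `./my_program` command are hypothetical, and the limit of 4 GPUs per node is derived from 320 GPUs across 80 nodes. The Tiger Tutorial remains the authoritative source for the actual required settings.

```python
MAX_GPUS_PER_NODE = 4  # derived from 320 GPUs across 80 nodes


def render_job_script(job_name, nodes, tasks_per_node, walltime, command,
                      gpus_per_node=0):
    """Render a Slurm-style batch script as a string (assumes Slurm)."""
    if not 0 <= gpus_per_node <= MAX_GPUS_PER_NODE:
        raise ValueError(f"gpus_per_node must be between 0 and {MAX_GPUS_PER_NODE}")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={walltime}",
    ]
    if gpus_per_node:
        # The extra line a GPU job needs under the Slurm assumption.
        lines.append(f"#SBATCH --gres=gpu:{gpus_per_node}")
    lines += ["", command]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical GPU job: one node, two GPUs, one hour.
    print(render_job_script("gpu-test", 1, 2, "01:00:00",
                            "srun ./my_program", gpus_per_node=2))
```

Leaving `gpus_per_node` at its default of 0 produces a script with no GPU request, which corresponds to a non-GPU job.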
Tiger will be down for maintenance on the second Tuesday of each month from 6 to 10 AM.
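When planning long runs, it can help to check whether a job's expected time span crosses the maintenance window. The sketch below computes the second Tuesday of a month with Python's standard `datetime` module and tests for overlap. The function names are hypothetical, and all times are treated as naive local times, which is an assumption.

```python
import datetime


def second_tuesday(year, month):
    """Return the date of the second Tuesday of the given month."""
    first = datetime.date(year, month, 1)
    # date.weekday(): Monday is 0, Tuesday is 1.
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + datetime.timedelta(days=days_to_first_tuesday + 7)


def maintenance_window(year, month):
    """Return (start, end) datetimes of the 6 to 10 AM window."""
    day = second_tuesday(year, month)
    start = datetime.datetime.combine(day, datetime.time(6, 0))
    end = datetime.datetime.combine(day, datetime.time(10, 0))
    return start, end


def overlaps_maintenance(job_start, job_end):
    """Return True if [job_start, job_end) intersects any monthly window."""
    year, month = job_start.year, job_start.month
    # Check every month the job touches, since a job may span a month boundary.
    while (year, month) <= (job_end.year, job_end.month):
        start, end = maintenance_window(year, month)
        if job_start < end and job_end > start:
            return True
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return False


if __name__ == "__main__":
    print(second_tuesday(2024, 1))
    print(overlaps_maintenance(datetime.datetime(2024, 1, 9, 5, 0),
                               datetime.datetime(2024, 1, 9, 7, 0)))
```

A job that would overlap the window can be shortened or submitted so that it finishes before 6 AM on that day.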