A Few Caveats
These instructions assume you:
- Are Princeton University faculty, student, or staff, or you have an RCU account
- Are connecting...
  - (if on-campus) from the campus wireless eduroam network, or a wired campus connection, or through Nobel
  - (if off-campus) with GlobalProtect VPN, or through Nobel. You can install the GlobalProtect VPN on your laptop. Make sure you connect via the GlobalProtect app on your laptop before ssh-ing to Adroit or another cluster.
- Are connecting with Duo Authentication
  - All of the clusters, from both on campus and off campus, now require two-factor authentication via Duo. If you need help getting this set up, contact the OIT Support and Operations Center. You can also see OIT's resources for using Duo here.
  - Upon connecting, you can request a push to a cell phone application, request a text with a passcode, or enter a passcode generated with a soft key created by the Duo application on your cell phone.
  - If you use a system that respects a standard '~/.ssh/config' file, you can use a multiplexing solution.
- Have an account on the system you're looking to connect to. (Note: You will need to log in using your university credentials.)
To see how to get an account, visit the page for your specific system within the Systems submenu.
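As a rough illustration of the multiplexing idea mentioned in the list above, the sketch below writes an example '~/.ssh/config' fragment to a scratch file so you can review it before merging it into your real config by hand. The alias "adroit", the socket path, and the 10-minute persistence window are assumptions chosen for this example, not settings prescribed by this guide; ControlMaster, ControlPath, and ControlPersist are standard OpenSSH client options.

```shell
#!/usr/bin/env bash
# Write an example multiplexing fragment to a scratch file, NOT to ~/.ssh/config.
example_file="$(mktemp)"

cat > "$example_file" <<'EOF'
# "adroit" is a hypothetical alias; replace <YourNetID> with your NetID.
# Create the socket directory first:
#   mkdir -p ~/.ssh/sockets && chmod 700 ~/.ssh/sockets
Host adroit
    HostName adroit.princeton.edu
    User <YourNetID>
    # Let later sessions share the first authenticated connection.
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h:%p
    # Keep the shared connection open for 10 minutes after the last session exits.
    ControlPersist 10m
EOF

# Show the fragment for review.
cat "$example_file"
```

With a fragment like this merged into your config, "ssh adroit" would open the first connection as usual, and sessions started while that connection is alive should reuse it rather than authenticating again.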
In order to connect to the university computing clusters, you will need an SSH (secure shell) client, a piece of software for establishing secure connections to remote machines.
On macOS and Linux, the default Terminal application has such a client built in. No download is necessary!
On Windows 10 machines, there is also an SSH-enabled client. (If for some reason you don't have SSH enabled under Windows 10, follow this guide to enable it.)
On Windows 8 machines, you'll need a client. One option is PuTTY and another is MobaXterm.
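If you are unsure whether a client is already installed, a quick check along the following lines may help. This is a minimal sketch for a macOS, Linux, or WSL command line; the variable name ssh_status is just a label used in this example.

```shell
#!/usr/bin/env bash
# Look for an ssh client on the PATH and report what was found.
if command -v ssh >/dev/null 2>&1; then
    ssh_status="found"
    # ssh prints its version to stderr, so redirect it to stdout.
    ssh -V 2>&1
else
    ssh_status="missing"
    echo "No ssh client found on PATH"
fi
echo "ssh client: $ssh_status"
```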
To connect to a cluster via SSH on Linux, macOS, or Windows 10:
(Connecting on Windows 8 is discussed separately below.)
Access a command-line on your laptop
- Linux: open a Terminal window (usually by pressing Ctrl+Alt+t --- i.e. press and hold Ctrl, and without releasing it, press and hold Alt, and then without releasing either of those two keys, type 't')
- macOS: open a Terminal window (by launching the Terminal app located in /Applications/Utilities)
- Windows 10: Windows 10 has a few different ways to access a command-line interface
- PowerShell or Command Prompt --- these are a couple of Windows-native (DOS-like) command-line environments. To access them, press Win+x (i.e., while holding down the Windows key, type 'x'). This opens the so-called "Power Users" menu in Windows. Select either "Command Prompt" or "PowerShell" (usually you'll see one or the other, depending on which Windows updates you've installed). Note that the Command Prompt and PowerShell are not equivalent command-line environments in general, but for the purposes of using SSH, they work the same and either will do.
- Windows Subsystem for Linux (WSL) --- WSL is an optional feature you can enable that furnishes a genuine Linux command-line within Windows. If you have WSL enabled, then you should have an SSH client on the Linux side by default, just as you would in a regular Linux operating system.
- SSH into the cluster
The syntax for using ssh is the same in all of the above scenarios. Remember to make sure you're on a Princeton VPN, and then on the command line you accessed in the previous step, type
ssh <YourNetID>@<clustername>.princeton.edu
So for instance, if your NetID is abc1, and you'd like to connect to the Adroit cluster, you would type:
ssh abc1@adroit.princeton.edu
If this is your first time connecting to this cluster from whatever computer you're on, you will then see a comment about a fingerprint along with the question
Are you sure you want to continue connecting (yes/no)?
Answer 'yes' and hit Enter.
You will now be prompted for your usual Princeton password. Enter it.
*NOTE*: there are no asterisks or dots to indicate how many characters you've typed, so if you think you've made a typo, hit Backspace many times and enter the password from scratch.
*NOTE*: If you've previously connected to a cluster and set up SSH keys, you will be connected without being prompted for a password first.
Depending on whether you're on a VPN and how it's configured, you may now be prompted to enter a Duo code (if you are, do so).
That's it -- you should now be connected to a cluster and see its Linux command-line prompt (e.g. [adroit4:~ <yourNetID>]$) instead of the one for your local computer.
- Ending the SSH connection to Adroit
To close the SSH connection, simply type
exit
at the Adroit command line and press Enter. This should close the connection, and your local computer's command-line prompt should reappear.
To connect to a cluster via SSH on Windows 8:
Windows 8 does not have a built-in SSH client, nor does it have a WSL that offers native access to a Linux command line. So if you run Windows 8 and want to make an SSH connection from within Windows (as opposed to, say, by running Linux inside VirtualBox and connecting to Adroit from within that virtual Linux session), then you need to install a separate SSH client.
We recommend either PuTTY or MobaXterm. These lightweight clients (MobaXterm has more features) have a graphical interface to initiate SSH connections.
This video shows briefly how to connect to a remote server using PuTTY (starting at timestamp 0:54). In the field for "Host Name", enter adroit.princeton.edu (leave the port number as 22). When you connect and it prompts you "login as: ", enter your NetID and then your password (again, you may be asked to Duo authenticate after entering your password). You should then be logged into Adroit and see its Linux command-line prompt. For more detailed information about PuTTY, consult this guide.
MobaXterm should work fairly similarly.
If you have trouble connecting, see this page.
Example of Connecting via SSH Using Terminal
Once you launch an instance of Terminal, you'll be at a command prompt on your local machine (i.e., on your computer) that looks something like this:
benjaminhicks ~/hpc_beginning_workshop $
The '$' is an indication that you're ready to enter a command.
To connect to a cluster, the general address looks like this, where you replace the <>'s with the needed content:
ssh <YourNetID>@<clustername>.princeton.edu
To connect to Adroit, as a user with the NetID bhicks, I'd type something like this
benjaminhicks ~/hpc_beginning_workshop $ ssh bhicks@adroit.princeton.edu
and after hitting Enter, I'd see something like this
nat-oitwireless-inside-vapornet100-c-14666:hpc_beginning_workshop bhicks$ ssh bhicks@adroit.princeton.edu
Warning: the ECDSA host key for 'adroit.princeton.edu' differs from the key for the IP address '18.104.22.168'
Offending key for IP in /Users/bhicks/.ssh/known_hosts:88
Matching host key in /Users/bhicks/.ssh/known_hosts:115
Are you sure you want to continue connecting (yes/no)? yes
Password:
Duo two-factor login for bhicks
Enter a passcode or select one of the following options:
 1. Duo Push to XXX-XXX-3224
 2. Phone call to XXX-XXX-3224
 3. Phone call to XXX-XXX-8335
 4. SMS passcodes to XXX-XXX-3224 (next code starts with: 1)
Passcode or option (1-4): 493203
Success. Logging you in...
Last login: Wed Oct 10 09:12:28 2018 from nat-oitwireless-outside-vapornet3-l-14.princeton.edu
[adroit4:~ bhicks]$
I'm now remotely connected to adroit4, which is the head node of the cluster!
The shell I used in both cases is one called Bash. It's a particular command-line interface that is common across Unix-like machines.
SSH Keys: ssh without typing passwords
Typing a password every time you connect to a machine or, worse, every time you copy a file to or from a remote machine quickly becomes tedious. One solution is to enable passwordless login and remote operations by generating a public/private pair of SSH keys and using them to negotiate the connection. The procedure is explained in this guide.
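To give a feel for what generating a key pair involves, here is a minimal sketch that creates a demo pair in a throwaway directory, so nothing in your real ~/.ssh is touched. The file name id_ed25519_demo and the empty passphrase are assumptions made only so the example runs unattended; for a real key you would likely accept the default location and consider setting a passphrase, and the linked guide remains the reference for the actual procedure.

```shell
#!/usr/bin/env bash
# Generate a demo key pair in a temporary directory.
keydir="$(mktemp -d)"
keyfile="$keydir/id_ed25519_demo"   # hypothetical file name for this example

if command -v ssh-keygen >/dev/null 2>&1; then
    # -t ed25519: key type, -f: output file, -N "": empty passphrase (demo only), -q: quiet
    ssh-keygen -t ed25519 -f "$keyfile" -N "" -q
    key_created="yes"
    # The directory now holds the private key and the matching .pub file.
    ls "$keydir"
    # To install a public key on a cluster you would then run something like:
    #   ssh-copy-id -i "$keyfile.pub" <YourNetID>@adroit.princeton.edu
else
    echo "ssh-keygen not found on PATH"
    key_created="no"
fi
```

Only the .pub file is ever copied to the remote machine; the private key stays on your computer.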
Staying connected (tmux)
If your SSH connection is suddenly broken then the command you're running terminates. This comes up very frequently when you're connected to Nobel, where tasks are run directly from the command line rather than a job scheduler.
One solution to this problem is tmux. It comes installed on all university clusters, and it lets you start a shell session that, rather than being remote via SSH, lives on the server.
A simple use case would be:
$ ssh <YourNetID>@adroit-vis.princeton.edu
$ tmux
$ wget https://www.bigdata.org/dataset.tar.gz
# ssh connection suddenly breaks!
# no problem just reconnect and attach
$ ssh <YourNetID>@adroit-vis.princeton.edu
$ tmux attach
To detach from your tmux session, press Ctrl+b, then d. To close a tmux session, run the "exit" command or press Ctrl+d. A tmux session will keep running on the remote server until the server is rebooted or you close it. Anything you run while attached to the tmux session runs in that session, and therefore it is safe from a disconnect.
Nobel has two hosts, "compton" and "davisson", and the session will live on one or the other. You can find the host by looking in the lower right corner of your tmux session window. To find your session again after a disconnect, you'll need to log in directly to that host rather than just nobel, e.g. "ssh compton.princeton.edu". Otherwise you'll run "tmux attach" and not find your session.
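If you keep more than one session around, giving each a name can make it easier to find later. The sketch below uses a made-up session name and a placeholder "sleep" command in place of real work. It starts the session detached so that it can run without an interactive terminal, lists the sessions, and then cleans up.

```shell
#!/usr/bin/env bash
# Hypothetical session name for this example (made unique with the shell's PID).
session="demo_$$"

if command -v tmux >/dev/null 2>&1; then
    # -d: start detached, -s: session name; 'sleep 60' stands in for real work.
    tmux new-session -d -s "$session" 'sleep 60'
    # The named session should appear in this list.
    tmux list-sessions
    # From an interactive terminal you would reattach with:
    #   tmux attach -t <session-name>
    # Remove the demo session again.
    tmux kill-session -t "$session"
    tmux_demo="ran"
else
    echo "tmux is not installed on this machine"
    tmux_demo="skipped"
fi
```

On Nobel, the same commands would need to be run on whichever of the two hosts the session was started on.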
tmux is a powerful and complex tool. In addition to the simple guide linked above, you might explore the following:
- tmux and other ways to improve your command line skills by Troy Comi of Princeton
- tmux - a very simple beginner's guide
- Beginner’s Guide to Tmux (Feel free to ignore the installation as it's already on the clusters, unless you want to run this on your Mac!)
- A tmux primer
Learn More About a Cluster by Running Commands
Once in a cluster, type each command below and examine the output:
hostname               # get the name of the machine you are on
whoami                 # get username of the account
date                   # get the current date and time
pwd                    # print working directory
cat /etc/os-release    # info about operating system
lscpu                  # info about the CPUs on head node
shownodes              # info about the compute nodes (7 nodes for myadroit)
squeue                 # which jobs are running or waiting to run
qos                    # quality of service (job partitions and limits)
slurmtop               # shows a map of cluster usage
who                    # list users on the head node
checkquota             # view your quota and request more space
Here is example output from the commands above on Adroit:
$ hostname
adroit5
$ whoami
ceisgrub
$ date
Tue Feb 21 14:37:44 EST 2023
$ pwd
/home/ceisgrub
$ cat /etc/os-release
NAME="Springdale Open Enterprise Linux"
VERSION="8.7 (Modena)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.7"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Springdale Open Enterprise Linux 8.7 (Modena)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:springdale:enterprise_linux:8.7:GA"
HOME_URL="https://springdale.princeton.edu/"
BUG_REPORT_URL="https://springdale.princeton.edu/bugzilla"
REDHAT_BUGZILLA_PRODUCT="Springdale Open Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.7
REDHAT_SUPPORT_PRODUCT="Springdale Open Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.7"
$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  1
Core(s) per socket:  16
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz
Stepping:            7
CPU MHz:             3900.000
CPU max MHz:         3900.0000
CPU min MHz:         1200.0000
BogoMIPS:            5800.00
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            22528K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
$ shownodes
NODELIST      PART   STATE      FREE/TOTAL CPUs  CPU_LOAD  FREE/TOTAL MEMORY  FREE/TOTAL GPUs  FEATURES
adroit-08     class  idle       32/32            0.00      379908/384000Mb                     skylake,intel
adroit-09     class  idle       32/32            0.91      382191/384000Mb                     skylake,intel
adroit-10     class  idle       32/32            0.00      378779/384000Mb                     skylake,intel
adroit-11     class  mixed      25/32            0.01      333867/384000Mb                     skylake,intel
adroit-12     class  idle       32/32            0.00      382422/384000Mb                     skylake,intel
adroit-13     class  mixed      20/32            12.02     254458/384000Mb                     skylake,intel
adroit-14     class  allocated  0/32             32.23     344374/384000Mb                     skylake,intel
adroit-15     class  idle       32/32            0.00      362568/384000Mb                     skylake,intel
adroit-16     class  idle       32/32            0.00      355045/384000Mb                     skylake,intel
adroit-h11g1  gpu    mixed      38/40            0.26      671544/770000Mb    2/4 tesla_v100   v100,intel
adroit-h11g2  gpu    mixed      40/48            1.07      694282/1000000Mb   3/4 nvidia_a100  a100,intel
adroit-h11g3  gpu    mixed      52/56            0.15      640082/760000Mb    3/4 tesla_v100   v100,intel
adroit-h11n1  class  idle       128/128          0.00      250613/256000Mb                     amd,rome
adroit-h11n2  all    allocated  0/64             56.01     167196/512000Mb                     intel,ice
adroit-h11n3  all    mixed      7/64             62.31     197670/512000Mb                     intel,ice
adroit-h11n4  all    allocated  0/64             28.02     139797/512000Mb                     intel,ice
adroit-h11n5  all    mixed      16/64            3.10      303855/512000Mb                     intel,ice
adroit-h11n6  all    mixed      4/64             35.97     270328/512000Mb                     intel,ice
$ squeue
JOBID    PARTITION  NAME      USER      ST  TIME        NODES  NODELIST(REASON)
1712982  all        gauss.cm  dt0998    PD  0:00        1      (QOSGrpCpuLimit)
1711873  all        syn_ctrl  slala     PD  0:00        1      (QOSGrpCpuLimit)
1711872  all        syn_ctrl  slala     PD  0:00        1      (QOSGrpCpuLimit)
1711866  all        syn_ctrl  slala     PD  0:00        1      (QOSGrpCpuLimit)
1711865  all        syn_ctrl  slala     PD  0:00        1      (QOSGrpCpuLimit)
1711853  all        cv        slala     PD  0:00        1      (QOSMaxCpuPerUserLimit)
1711874  all        cv        slala     PD  0:00        1      (Dependency)
1711867  all        cv        slala     PD  0:00        1      (Dependency)
1711864  all        cv        slala     PD  0:00        1      (DependencyNeverSatisfied)
1712972  all        Mocha9    dnpham    PD  0:00        1      (QOSGrpCpuLimit)
1712979  all        DTBP-sca  jdeobald  R   43:55       1      adroit-h11n4
1712949  all        PO        xuanhong  R   2:43:12     1      adroit-h11n6
1712948  all        PO        xuanhong  R   2:46:24     1      adroit-h11n6
1712771  all        test_job  sk5339    R   17:55:45    1      adroit-h11n3
1712692  all        Fe40      barsukov  R   1-00:41:16  1      adroit-14
1712922  all        sys/dash  nhazra    R   3:45:15     1      adroit-h11n6
1712705  all        16Ti_non  bw1755    R   23:51:18    1      adroit-h11n4
1712899  all        TS3       barsukov  R   2:23:44     1      adroit-h11n2
1712984  all        gauss.cm  dt0998    R   38:18       1      adroit-h11n5
1712983  all        gauss.cm  dt0998    R   38:53       1      adroit-h11n5
1712999  all        poisson_  sf5201    R   8:05        1      adroit-h11n5
1712993  all        sys/dash  jlca      R   13:24       1      adroit-h11n4
1712894  all        sys/dash  dpmoore   R   4:36:59     1      adroit-h11n6
1711863  all        syn_ctrl  slala     R   3:47:14     1      adroit-h11n3
1711862  all        syn_ctrl  slala     R   5:10:43     1      adroit-h11n6
1712913  all        sys/dash  ec7636    R   3:58:59     1      adroit-h11n6
1712962  all        sys/dash  ec7636    R   1:29:19     1      adroit-h11n6
1712879  all        sys/dash  gc0394    R   5:20:59     1      adroit-h11n2
1712903  all        sys/dash  kaneelil  R   4:16:55     1      adroit-h11n6
1712971  all        visc_coa  kaneelil  R   59:27       1      adroit-h11n6
1711419  all        sys/dash  hajc      R   4-13:07:28  1      adroit-h11n5
1712213  all        Mocha6    dnpham    R   1-06:15:38  1      adroit-h11n2
1711707  all        sys/dash  hyork     R   3-21:44:32  1      adroit-h11n6
1712706  all        Mocha7    dnpham    R   21:39:50    1      adroit-h11n2
1712624  all        Mocha8    dnpham    R   1-03:46:59  1      adroit-13
1712967  all        Mocha4    dnpham    R   1:05:46     1      adroit-h11n4
1712976  class      sys/dash  gc6782    R   46:40       1      adroit-11
1712946  class      sys/dash  mo9718    R   2:55:32     1      adroit-11
1712989  class      sys/dash  dm46      R   20:20       1      adroit-11
1712896  class      sys/dash  law2      R   4:35:10     1      adroit-11
1711985  class      sys/dash  vikashm   R   2-22:00:10  1      adroit-11
1712901  class      sys/dash  jm4437    R   4:24:53     1      adroit-11
1712987  gpu        sys/dash  awtang    R   34:47       1      adroit-h11g3
1712953  gpu        modular_  tinghanf  R   1:46:25     1      adroit-h11g2
1712407  gpu        sys/dash  xuchenz   R   1-17:58:11  1      adroit-h11g1
$ who
fcastro   pts/0   2023-02-21 13:47 (172.20.217.236)
mm5986    pts/8   2023-02-13 14:54 (22.214.171.124)
rsouth    pts/9   2023-02-21 13:24 (10.9.87.6)
ys5910    pts/30  2023-02-21 14:28 (172.21.2.7)
tinghanf  pts/31  2023-02-21 10:07 (126.96.36.199)
xuchenz   pts/34  2023-02-21 13:48 (172.20.217.107)
zs0806    pts/37  2023-02-21 10:03 (172.21.2.7)
dnpham    pts/39  2023-02-21 10:14 (10.9.115.115)
dnpham    pts/40  2023-02-21 10:15 (10.9.115.115)
dpmoore   pts/44  2023-02-21 13:55 (172.20.216.19)
sf5201    pts/45  2023-02-21 13:58 (172.20.216.192)
gc6782    pts/47  2023-02-21 10:39 (10.8.20.197)
ak6174    pts/48  2023-02-21 11:04 (10.9.92.40)
cw1074    pts/52  2023-02-21 14:07 (172.20.210.60)
root      pts/59  2023-02-21 14:21 (172.21.2.12)
jdeobald  pts/49  2023-02-21 10:41 (10.8.39.167)
cw1074    pts/53  2023-02-21 14:10 (172.20.210.60)
rt6814    pts/55  2023-02-21 14:14 (10.9.90.60)
zs0806    pts/76  2023-02-21 11:44 (172.21.2.7)
dpmoore   pts/50  2023-02-21 14:03 (172.20.205.177)
jdh4      pts/61  2023-02-21 14:37 (188.8.131.52)
ys5910    pts/73  2023-02-21 14:37 (172.21.2.7)
cw1074    pts/81  2023-02-21 13:39 (172.20.210.60)
$ checkquota
Storage/size quota filesystem report for user: ceisgrub
Filesystem              Mount             Used   Limit  MaxLim  Comment
Adroit home             /home             9.1GB  9.3GB  10GB
Adroit scratch          /scratch          0      0      0
Adroit scratch network  /scratch/network  1.7GB  93GB   100GB
Storage number of files used report for user: ceisgrub
Filesystem              Mount             Used   Limit  MaxLim  Comment
Adroit home             /home             80.5K  975K   1.0M
Adroit scratch          /scratch          1      0      0
Adroit scratch network  /scratch/network  18.9K  9.8M   10.5M
For quota increase requests please use this website: https://forms.rc.princeton.edu/quota
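If you would rather collect this information in one pass than type each command, a small helper like the one below is one way to do it. This is only a sketch: the function name cluster_report is made up for this example, and cluster-specific tools such as shownodes, squeue, and checkquota are skipped with a note when they are not installed, so the script also runs on a laptop.

```shell
#!/usr/bin/env bash
# Run each command if it exists on this machine; otherwise note that it is missing.
cluster_report() {
    local cmd
    for cmd in hostname whoami date pwd lscpu shownodes squeue checkquota; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "== $cmd =="
            "$cmd"
        else
            echo "== $cmd: not available here =="
        fi
    done
}

cluster_report
```

To keep a copy of the output for later reference, you could redirect it to a file, for example "cluster_report > cluster_info.txt".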