How Do I Get My Files Onto (or Off) the Cluster?
One of the most frequently asked questions is how to get files to Adroit or any other cluster. We recommend using the 'scp' command.
On macOS, you can run this command in the Terminal application.
On Windows, you need a client such as PowerShell, PuTTY, or MobaXterm, or an FTP client such as WS_FTP (paid, sadly) or FileZilla. In both cases, make sure you're using interactive mode. FileZilla especially can be a pain with Duo Authentication for Nobel, but we have some tips here.
If you're transferring a lot of files, consider zipping them.
The default client for transferring files is 'scp', a copy command that uses SSH to move files between machines.
The general structure of the scp command is:
scp [options] [netid]@source-host:file/location [netid@]destination-host:file/location
It's generally more straightforward to transfer files from your personal computer to the clusters, since you don't need to specify the user and host for the system you're already in.
Example - Using scp Command
Note that for the following commands to work from off-campus you must be connected via the campus VPN.
From your local machine (e.g., laptop) to a Princeton cluster
A Princeton student with the netid jessedoe wants to transfer a 'data.csv' file from their personal laptop to the '/scratch/network/jessedoe' directory on the Adroit cluster. To do this, they would use the following command:
scp ~/mydatafiles/data.csv jessedoe@adroit.princeton.edu:/scratch/network/jessedoe
Transferring a folder requires the -r option, and would therefore look like this:
scp -r ~/mydatafiles jessedoe@adroit.princeton.edu:/scratch/network/jessedoe
From a Princeton cluster to your local machine (e.g., laptop)
To transfer the file myfile.txt from Adroit to your Desktop:
$ scp jessedoe@adroit.princeton.edu:/scratch/network/jessedoe/myfile.txt /Users/jessedoe/Desktop
To transfer the project1 directory and all of its contents to the "research" directory on your laptop:
$ scp -r jessedoe@adroit.princeton.edu:/scratch/network/jessedoe/project1 /Users/jessedoe/research
If you need to transfer particularly large files, you may need to use Globus. We have additional information on Globus on our website, and run a regular workshop on data transfers with Globus.
If you are transferring many files, your best bet is to use the 'tar' command to create a gzipped or bzipped archive of the folder before transferring it. (bzip2 takes longer to create an archive than gzip, but it compresses better, so that's the tradeoff.)
If you want to tar your home folder, for example, you need to create the tarfile somewhere other than your /home/<your-netid> folder, such as a /project or /scratch folder that you have access to.
We recommend running the 'tar' command from the folder above the one you want to tar up (so /home rather than inside your /home/<your-netid> folder), which helps avoid issues such as matching hidden folders with wildcards.
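To see why the starting folder matters, here is a minimal sketch you can run anywhere. It uses a throwaway sandbox (the directory and file names below are made up for illustration and are not real cluster paths) and gzip compression. Archiving with a wildcard from inside the folder lets the shell skip names that begin with a dot, while naming the folder from its parent captures everything.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox standing in for /home; nothing here touches real data.
workdir=$(mktemp -d)
mkdir -p "$workdir/home/jessedoe/.config"
echo "x,y" > "$workdir/home/jessedoe/data.csv"
echo "setting=1" > "$workdir/home/jessedoe/.config/app.conf"

# From inside the folder: the shell expands * and skips names starting with a dot.
cd "$workdir/home/jessedoe"
tar czf "$workdir/inside.tar.gz" *
inside=$(tar tzf "$workdir/inside.tar.gz")    # t = list the archive contents

# From the parent folder: naming the folder itself captures hidden items too.
cd "$workdir/home"
tar czf "$workdir/parent.tar.gz" jessedoe
parent=$(tar tzf "$workdir/parent.tar.gz")

echo "Archived from inside the folder:"
echo "$inside"
echo "Archived from the parent folder:"
echo "$parent"
```

Both archives are written outside the folder being archived, which also keeps tar from trying to include the archive in itself.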
The general structure of the tar command is:
tar [options] [desired-path-and-name-of-the-tar-file-you-will-create] [folder-to-be-tarred]
In the [options] of the command above, we recommend using the options c, j, v, and f, which do the following:
- c - Create a tar file
- j - use bzip2 compression (use z for gzip compression instead)
- v - verbose. This is optional, but I like to use it to make sure my command is doing what I wanted it to do
- f - filename of the file to create. This must come last among the options, as it expects the next argument to be the filename.
To give an example, if your netid is jessedoe, and you want to tar/bzip2 your /home/<your-netid> folder and write the resulting file to your /scratch/gpfs/<your-netid> folder, you would use the following commands:
cd /home
tar cjvf /scratch/gpfs/jessedoe/jessedoe-della.tar.bz2 jessedoe
Then you can copy your jessedoe-della.tar.bz2 file with the 'scp' command to wherever you want.
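Once the archive has been copied, it still has to be unpacked on the receiving side. The sketch below shows one way that round trip might look. It runs entirely in a temporary sandbox (the paths and the archive name jessedoe-demo.tar.gz are hypothetical) and uses gzip, so the option is z; for a .tar.bz2 archive like the one above you would use j instead.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: "home" is the source side, "destination" the receiving side.
workdir=$(mktemp -d)
mkdir -p "$workdir/home/jessedoe" "$workdir/destination"
echo "x,y" > "$workdir/home/jessedoe/data.csv"

# Create the archive from the parent folder, as in the example above.
cd "$workdir/home"
tar czf "$workdir/jessedoe-demo.tar.gz" jessedoe

# On the receiving side: x = extract, -C = unpack into the given directory.
tar xzf "$workdir/jessedoe-demo.tar.gz" -C "$workdir/destination"

# Confirm the file came back intact.
restored=$(cat "$workdir/destination/jessedoe/data.csv")
echo "$restored"
```

Extracting with -C keeps the unpacked folder out of whatever directory you happen to be in, which is handy when the archive holds an entire home folder.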
Learn More About File Transfer Options
Additional information on transferring data with tools such as scp, ftp, rsync, and Globus can be found on Research Computing's Learning Resources: Data Transfer page.
SSH Keys: scp without typing passwords
Typing a password every time you want to connect to a machine or, worse, every time you want to copy a file to or from a remote machine gets annoying quickly. One solution is to enable passwordless login and remote operations by generating a public/private pair of SSH keys and using them to negotiate the connection. The procedure is explained in this guide.
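As a rough idea of what generating a key pair involves, here is a minimal sketch; the guide remains the reference for the clusters. It assumes OpenSSH's ssh-keygen is installed, writes the keys to a temporary directory rather than ~/.ssh, and uses an empty passphrase only so the example runs unattended. The key type, file name, and comment are illustrative choices, not requirements from the guide.

```shell
#!/bin/sh
set -eu

# Sandbox directory for the demo; real keys normally live in ~/.ssh.
keydir=$(mktemp -d)
keyfile="$keydir/id_ed25519"

# Generate an Ed25519 key pair.
# -f = output file, -N "" = empty passphrase (demo only), -C = comment, -q = quiet.
ssh-keygen -t ed25519 -f "$keyfile" -N "" -C "demo key" -q

# The private key stays on your machine; the .pub file is what the remote side needs.
ls "$keydir"

# Hypothetical next step, not run here: install the public key on the cluster.
# ssh-copy-id -i "$keyfile.pub" jessedoe@<cluster-hostname>
```

For a key you actually intend to use, consider setting a passphrase and letting ssh-agent hold it, so that a copied private key is not enough on its own to log in.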