Sharing Data Outside of Princeton

This page describes ways that users of the Princeton HPC clusters can share their data beyond the university. If you are looking to share data with another Princeton HPC user then see Sharing Data with Other Users.

 

Common Terminal Tools

The starting point for transferring data from a Princeton system to elsewhere is the set of common terminal tools: scp, sftp, rsync and others. These tools are a good choice for anything from individual files to entire data sets of tens of gigabytes. If you are on a high-speed network then you will be able to transfer hundreds of gigabytes in a reasonable amount of time (less than a day).

The commands below provide an example of transferring a single file on tiger to a second account at ORNL:

$ ssh <YourNetID>@tiger.princeton.edu
$ cd /scratch/gpfs/<YourNetID>
$ scp file.dat <Username>@dtn.ccs.ornl.gov:/data/

To learn more about scp and other common tools see this page. Consider compressing your data using gzip or bzip2 before transferring it.
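As a minimal sketch of the compression step, the commands below bundle a directory into a single gzip-compressed archive and record a checksum that the recipient can use to confirm the copy arrived intact. The directory name results, the file run1.csv and the temporary working directory are hypothetical and are created here only so the example runs on its own. The checksum step is an optional addition, not something this page requires.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical working directory and data set, created only for illustration.
workdir=$(mktemp -d)
mkdir -p "$workdir/results"
printf 'step,energy\n1,-0.52\n2,-0.61\n' > "$workdir/results/run1.csv"
cd "$workdir"

# Bundle the directory into one gzip-compressed archive.
tar -czf results.tar.gz results

# Record a checksum to send along with the archive.
sha256sum results.tar.gz > results.tar.gz.sha256

# On the receiving side, after copying both files, verify the archive.
sha256sum -c results.tar.gz.sha256
```

The archive and its .sha256 file can then be sent with scp in the same way as file.dat above. Compression tends to help most with text-based data; files that are already compressed usually shrink very little.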

 

Globus

Globus uses multiple streams to transfer data, which gives it a significant performance advantage over the common terminal tools and makes it the obvious choice for very large transfers. However, Globus can only be used when it is available at both endpoints. See this page to learn how to use Globus at Princeton.

 

DataSpace

DataSpace offers long-term storage and publication options for datasets, visualizations or reports.

 

Publishing Large Datasets

See this guide by Princeton Research Data Service.

 

tigress-web

See this page to learn how to make files in /tigress or /projects available such that external collaborators can access them using a web browser or the wget command, for instance. Note that there are a few constraints associated with this approach.