Data Transfer with Globus

Globus at Princeton

Globus is, among other things, an infrastructure for transferring large amounts of research data. Research Computing supports Globus data transfer to and from the GPFS-based /tigress, /projects and /scratch/gpfs filesystems connected to the Research Computing clusters.

Research Computing has multiple Globus endpoints. An endpoint is called a "Collection" in the Globus app:

Princeton Della /scratch/gpfs

Use this endpoint for the /scratch/gpfs filesystem of Della (replace "aturing" with your NetID):

Globus Endpoint

Princeton Tigress

The "Princeton Tigress" data transfer node is connected to the Campus Data Network with 10 Gb/s Ethernet and to the Tigress facility's GPFS storage cluster with FDR (54 Gb/s) InfiniBand. Use this endpoint for /tigress, /projects and the /scratch/gpfs filesystem of Tiger, as shown below (replace "aturing" with your NetID):

Globus Endpoint
Globus Endpoint
Globus Endpoint
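
As a rough illustration of what these link speeds mean in practice, the sketch below computes a lower bound on transfer time. It assumes the slower of the two links (the 10 Gb/s Ethernet connection) is the bottleneck, uses decimal units (1 TB = 10^12 bytes), and ignores protocol overhead, filesystem contention and the speed of the remote side, so real transfers will take longer. The function name and the 1 TB example size are illustrative only.

```python
def min_transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Lower bound on transfer time over a link with the given nominal speed."""
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)


# Nominal link speeds quoted above for the "Princeton Tigress" node.
ETHERNET_GBPS = 10.0    # Campus Data Network side
INFINIBAND_GBPS = 54.0  # GPFS storage side

# The slower link limits the end-to-end rate.
bottleneck_gbps = min(ETHERNET_GBPS, INFINIBAND_GBPS)

one_terabyte = 1e12  # bytes, decimal units (example size, not from the text)
seconds = min_transfer_seconds(one_terabyte, bottleneck_gbps)
print(f"1 TB at {bottleneck_gbps:g} Gb/s: at least {seconds / 60:.1f} minutes")
```

Under these assumptions, 1 TB needs at least 800 seconds (about 13 minutes) on the 10 Gb/s link.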

Princeton Traverse/Stellar Scratch DTN

Use this endpoint for the shared /scratch/gpfs filesystem of Traverse and Stellar (replace "aturing" with your NetID):

Globus Endpoint

The Traverse and Stellar clusters have a separate Globus endpoint named “Princeton Traverse/Stellar Scratch DTN”. It has access to the /scratch/gpfs filesystem, which is shared between Traverse and Stellar.
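
The mapping from filesystem to collection described above can be summarized as a small lookup. This is only a sketch of the text, not an official tool: the function name is hypothetical, and the example path /scratch/gpfs/aturing assumes a per-NetID directory layout that the text implies but does not spell out.

```python
from typing import Optional

# Filesystems served by the "Princeton Tigress" collection regardless of cluster.
SHARED_FILESYSTEMS = {
    "/tigress": "Princeton Tigress",
    "/projects": "Princeton Tigress",
}

# /scratch/gpfs is a different filesystem on each cluster, so the cluster matters.
SCRATCH_COLLECTIONS = {
    "della": "Princeton Della /scratch/gpfs",
    "tiger": "Princeton Tigress",
    "traverse": "Princeton Traverse/Stellar Scratch DTN",
    "stellar": "Princeton Traverse/Stellar Scratch DTN",
}


def _is_under(path: str, root: str) -> bool:
    """True if path is root itself or lies below it."""
    return path == root or path.startswith(root + "/")


def collection_for(path: str, cluster: Optional[str] = None) -> str:
    """Return the name of the Globus collection to search for (hypothetical helper)."""
    for root, name in SHARED_FILESYSTEMS.items():
        if _is_under(path, root):
            return name
    if _is_under(path, "/scratch/gpfs"):
        if cluster is None:
            raise ValueError("/scratch/gpfs differs per cluster; pass the cluster name")
        key = cluster.lower()
        if key not in SCRATCH_COLLECTIONS:
            raise ValueError(f"no collection known for cluster {cluster!r}")
        return SCRATCH_COLLECTIONS[key]
    raise ValueError(f"{path!r} is not on a filesystem listed above")


print(collection_for("/tigress/aturing"))
print(collection_for("/scratch/gpfs/aturing", cluster="Stellar"))
```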

Some departments at Princeton University manage their own Globus data transfer nodes alongside their clusters. Contact your department staff to find out whether your department has one, or if you think it should have one.

Note that Research Computing’s Globus Endpoint does not support the Google Drive connector.

Globus Share

You cannot create a Globus share to give a collaborator access to data stored on, for example, /tigress. Attempting to do so results in the error "Details: 500 Sharing is not allowed for the current user." To access data on /tigress, a collaborator needs to have an RCU account set up and an account created on one of the clusters.

Transferring Data Between Globus Connect Personal Endpoints

You must be a Globus Plus member to transfer data between personal endpoints. Globus Plus membership is free. Browse to globus.org and choose "Settings", then "Globus Plus". Use "Princeton University Research Computing" as the sponsor when making the request.

When Should You Use Globus for Data Transfer? 

For transferring small amounts of data, the scp, sftp, and rsync (over ssh) utilities generally work well and are more widely available. However, Globus is recommended if a transfer takes a long time (more than 15 minutes) and happens frequently, or if it runs over an unreliable connection. Globus not only transfers data faster but also handles disruptions gracefully (e.g., it automatically resumes a transfer after a temporary network disconnection).
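
The rule of thumb above can be written down as a small helper, shown here together with a sketch of an rsync command for the small-transfer case. Both function names are hypothetical, the host name cluster.example.edu is a placeholder rather than a real login node, and the rule is read as "(long and frequent) or unreliable", which is one interpretation of the guidance. The command is only built and printed, not executed.

```python
import shlex
from typing import List

LONG_TRANSFER_MINUTES = 15  # threshold quoted in the guidance above


def recommend_tool(estimated_minutes: float,
                   recurring: bool = False,
                   unreliable_link: bool = False) -> str:
    """Apply the rule of thumb: (long and frequent) or unreliable -> Globus."""
    long_running = estimated_minutes > LONG_TRANSFER_MINUTES
    if (long_running and recurring) or unreliable_link:
        return "globus"
    return "rsync/scp/sftp"


def build_rsync_command(netid: str, host: str, source: str, dest: str) -> List[str]:
    """Build an rsync-over-ssh command line as an argument list.

    -a preserves permissions and timestamps, -v is verbose, and --partial
    keeps partially transferred files so an interrupted copy can be retried.
    """
    return ["rsync", "-av", "--partial", source, f"{netid}@{host}:{dest}"]


print(recommend_tool(estimated_minutes=5))
print(recommend_tool(estimated_minutes=45, recurring=True))

# "aturing" is the placeholder NetID used on this page; the host is made up.
command = build_rsync_command(
    "aturing", "cluster.example.edu", "results/", "/scratch/gpfs/aturing/results/"
)
print(shlex.join(command))
```

To run the command for real, pass the list to subprocess.run after replacing the placeholder host and NetID with your own.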

Globus must be available on both the source and destination endpoints. You can make your own machine a Globus endpoint by installing the Globus Connect Personal software. For shared Linux servers, system administrators should consider the Globus Connect Server software instead.

Who Can Use Research Computing's Globus Service and How?

Primary access to the Globus data transfer service is through the web interface at Globus.org. Please use your Princeton NetID and password for authentication. The Globus organization provides a series of "how to" documents including a getting started guide that covers logging in and transferring files. Research Computing has a locally written document describing How to Log in to Globus with a Princeton NetID.

You can link multiple credentials to a single Globus account. Long-time Globus users may wish to do this, as it allows you to use a single set of credentials (e.g. your Princeton NetID and password) rather than logging in with multiple accounts. Research Computing has a locally written document about How to link a New Identity to an Existing Globus Account, which assumes that you are currently logged in to Globus with credentials other than your Princeton NetID.

Anyone with a Princeton NetID can log in to Globus.org. However, access to a particular Globus endpoint depends on the server and filesystem behind it. Only users of the Tigress systems can use the Princeton Tigress DTN for Globus transfers; Nobel and Adroit users are excluded. The Princeton Traverse Scratch DTN is available to users who have an account on the Traverse cluster.


Publishing Large Datasets

For publishing large datasets, visit the Princeton Research Data Service website.

For Additional Questions and Support

Please consult the How to Get Help page.