Running Stata on the HPC Clusters

Run Stata in Your Web Browser

If you are new to high-performance computing, the simplest way to use Stata on the HPC clusters is through the Open OnDemand web interface. If you have an account on Adroit or Della, browse to https://myadroit.princeton.edu or https://mydella.princeton.edu. If you need an account on Adroit, complete this form. Note that you will need to use a VPN to connect from off-campus.

To begin a session, click on "Interactive Apps" and then "XStata". You will need to choose the "Stata version", "Number of hours" and "Number of cores". Set "Number of cores" to 1 unless you are sure that your script has been explicitly parallelized, for example by being written for Stata/MP, the multi-core version of Stata. Click "Launch" and then, when your session is ready, click "Launch XStata". Note that the more resources you request, the longer you will have to wait for your session to become available.

Running Stata on Nobel

To run command-line Stata on Nobel, connect via SSH and load the Stata module:

$ ssh <YourNetID>@nobel.princeton.edu
$ module load stata/16.0
$ stata

Mac users will need to have XQuartz installed, while Windows users should install MobaXterm (Home Edition). Visit the OIT Tech Clinic for assistance with installing, configuring and using these tools. To run the Stata GUI (xstata), enable X11 forwarding when you connect:

$ ssh -X <YourNetID>@nobel.princeton.edu
$ module load stata/16.0
$ xstata

Submitting Batch Jobs to the Slurm Scheduler

Stata can be run on the HPC clusters Adroit and Della. These clusters use a job scheduler, and all work must be submitted as a batch job. Intermediate and advanced Stata users often prefer submitting jobs to the Slurm scheduler over using the web interface (described above). A job consists of a Stata script (a do-file) and a Slurm script that specifies the needed resources and the commands to be run.

Running a Serial Stata Job

A serial Stata job is one that requires only a single CPU-core. Here is an example of a trivial, one-line serial Stata script (hello_world.do):

display 21+21

The Slurm script (job.slurm) below can be used for serial jobs:

#!/bin/bash
#SBATCH --job-name=stata         # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks across all nodes
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G per cpu-core is default)
#SBATCH --time=00:01:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=all          # send email on job start, end and fault
#SBATCH --mail-user=<YourNetID>@princeton.edu

module purge
module load stata/16.0

stata -b hello_world.do
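If your cluster's Stata installation includes Stata/MP, the multi-core version of Stata, a multi-core variant of the Slurm script might look like the sketch below. The module name stata-mp/16.0 is an assumption; check what is actually available with "module avail stata" on your cluster. The stata-mp executable is Stata/MP's batch binary.

```shell
#!/bin/bash
#SBATCH --job-name=stata-mp      # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # Stata/MP is multi-threaded, not MPI
#SBATCH --cpus-per-task=4        # match (or stay below) your Stata/MP license
#SBATCH --mem-per-cpu=4G         # memory per cpu-core
#SBATCH --time=00:10:00          # total run time limit (HH:MM:SS)

module purge
module load stata-mp/16.0        # assumed module name -- check "module avail stata"

# Stata/MP uses up to the number of cores allowed by the license,
# so request --cpus-per-task accordingly.
stata-mp -b hello_world.do
```

Stata/MP parallelizes many built-in commands automatically; a do-file that works with Stata/SE needs no changes to benefit.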

To run the Stata script, submit the job to the cluster with the following command:

$ sbatch job.slurm

After the job completes, view the output with cat hello_world.log:

  ___  ____  ____  ____  ____ (R)
 /__    /   ____/   /   ____/
___/   /   /___/   /   /___/   16.0   Copyright 1985-2019 StataCorp LLC
  Statistics/Data Analysis            StataCorp
                                      4905 Lakeway Drive
                                      College Station, Texas 77845 USA
                                      800-STATA-PC        http://www.stata.com
                                      979-696-4600        stata@stata.com
                                      979-696-4601 (fax)

100-user Stata network perpetual license:
       Serial number:  401606267559
         Licensed to:  Stata/SE 16
                       100-user Network

Notes:
      1.  Stata is running in batch mode.
      2.  Unicode is supported; see help unicode_advice.

. do "hello_world.do" 

. display 21+21
42

. 
end of do-file

Use squeue -u $USER to monitor the progress of queued jobs.
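One caveat worth knowing: in batch mode Stata can exit with status 0 even when the do-file fails, so Slurm may report the job as completed successfully. A common workaround is to scan the log for Stata error codes (lines such as "r(601);"). A minimal sketch, where the function name check_stata_log is ours:

```shell
# check_stata_log: hypothetical helper that scans a Stata batch log for
# error codes, which appear as lines like "r(601);". Returns non-zero if
# an error code is found, so it can gate later steps in a job script.
check_stata_log() {
    local log="$1"
    if grep -qE '^r\([0-9]+\);' "$log"; then
        echo "error found in $log"
        return 1
    fi
    echo "no errors in $log"
}

# Example usage at the end of job.slurm:
#   stata -b hello_world.do
#   check_stata_log hello_world.log || exit 1
```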