Julia on the HPC Clusters


Introduction

Julia is a flexible, dynamic language suited to scientific and numerical computing, with performance comparable to traditional statically typed languages; well-written Julia code can run nearly as fast as C. The language features optional type annotations and multiple dispatch, and it achieves its speed through type inference and just-in-time (JIT) compilation built on LLVM. It is multi-paradigm, combining features of imperative, functional, and object-oriented programming.
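
As a brief illustration of multiple dispatch (a minimal sketch using only the standard library; the function name is made up for this example), the same function can have several methods, and Julia selects one based on the argument types:

```julia
# two methods of the same function; Julia dispatches on the argument types
describe(x::Integer) = "integer: $x"
describe(x::AbstractString) = "string: $x"

println(describe(42))       # integer: 42
println(describe("hello"))  # string: hello
```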

 

Modules

To use Julia you need to load an environment module. For instance, on Adroit:

$ julia
-bash: julia: command not found

$ module avail julia
------------------ /usr/licensed/Modules/modulefiles ------------------
julia/0.4.7 julia/0.5.1 julia/0.7.0 julia/1.0.3 julia/1.2.0 julia/1.4.1
julia/0.5.0 julia/0.6.0 julia/1.0.1 julia/1.1.0 julia/1.3.0 julia/1.5.0

$ module load julia/1.5.0
$ julia
julia>

 

Serial Batch Jobs

Here is a simple Julia script (hello_world.jl):

println("Hello, world.")

Below is the Slurm script:

#!/bin/bash
#SBATCH --job-name=serial_jl     # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks across all nodes
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G is default)
#SBATCH --time=00:01:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-user=<YourNetID>@princeton.edu

module purge
module load julia/1.5.0

julia hello_world.jl

To run the Julia script, simply submit the job to the cluster:

$ sbatch job.slurm

After the job completes, view the output with cat slurm-*:

Hello, world.

Use squeue -u $USER to monitor queued jobs.

For example, to run the script above on Della, carry out these commands:

$ ssh <YourNetID>@della.princeton.edu
$ cd /scratch/gpfs/<YourNetID>
$ git clone https://github.com/PrincetonUniversity/hpc_beginning_workshop
$ cd hpc_beginning_workshop/RC_example_jobs/julia
# edit email address in job.slurm using a text editor
$ sbatch job.slurm

 

Running Parallel Julia Scripts using the Distributed Package

Julia comes with built-in parallel programming support. While many of the parallel packages are still under development, they can be used to achieve a significant speed-up. If your parallel processes are independent then consider using a Slurm job array instead of writing a parallel Julia script.

The example below presents a simple use case of the Distributed package. The Julia script (hello_world_distributed.jl) illustrates the basics of @spawnat and fetch:

using Distributed

# launch worker processes
num_cores = parse(Int, ENV["SLURM_CPUS_PER_TASK"])
addprocs(num_cores)

println("Number of cores: ", nprocs())
println("Number of workers: ", nworkers())

# each worker gets its id, process id and hostname
for i in workers()
    id, pid, host = fetch(@spawnat i (myid(), getpid(), gethostname()))
    println(id, " " , pid, " ", host)
end

# remove the workers
rmprocs(workers())
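
Note that the script reads SLURM_CPUS_PER_TASK, which is only set inside a Slurm job. If you want to test the script on a login node or your own machine, a hedged variant that falls back to a default of 1 avoids the resulting KeyError:

```julia
using Distributed

# fall back to 1 worker when SLURM_CPUS_PER_TASK is not set (e.g., outside a Slurm job)
num_cores = parse(Int, get(ENV, "SLURM_CPUS_PER_TASK", "1"))
addprocs(num_cores)

println("Number of workers: ", nworkers())

rmprocs(workers())
```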

Here is the Slurm script:

#!/bin/bash
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks across all nodes
#SBATCH --cpus-per-task=4        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G is default)
#SBATCH --time=00:01:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-user=<YourNetID>@princeton.edu

module purge
module load julia/1.5.0

julia hello_world_distributed.jl

The output should be something like:

Number of cores: 5
Number of workers: 4
2 19945 tiger-i25c1n11
3 19947 tiger-i25c1n11
4 19948 tiger-i25c1n11
5 19949 tiger-i25c1n11

There is much more that can be done with the Distributed package. You may also consider looking at distributed arrays and multithreading on the Julia website.
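
As a taste of what else the package offers, here is a small sketch using pmap, which distributes independent function evaluations across the workers (the function and worker count are illustrative):

```julia
using Distributed
addprocs(2)  # launch two local worker processes

# define the function on all workers, not just the master process
@everywhere slow_square(x) = x^2

# pmap farms the calls out to the workers and collects the results in order
results = pmap(slow_square, 1:5)
println(results)  # [1, 4, 9, 16, 25]

rmprocs(workers())
```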

 

Machine Learning

See the Intro to the Machine Learning Libraries workshop for working with Flux with GPUs and other packages.

 

Using Gurobi

To use Gurobi with packages such as JuMP, the GUROBI_HOME environment variable must be set. This is done by loading one of the Gurobi modules, for instance:

module load gurobi/9.0.1

To see the actual value for GUROBI_HOME, run this command:

$ module show gurobi/9.0.1 2>&1 | grep GUROBI_HOME
setenv GUROBI_HOME /usr/licensed/gurobi/9.0.1/linux64

To start working with JuMP and Gurobi follow these steps:

$ module load gurobi/9.0.1
$ module load julia/1.5.0
$ julia
julia> import Pkg
julia> Pkg.add("Gurobi")
julia> Pkg.add("JuMP")
julia> using JuMP, Gurobi
julia> model = Model(Gurobi.Optimizer)
A JuMP Model
Feasibility problem with:
Variables: 0
Model mode: AUTOMATIC
CachingOptimizer state: EMPTY_OPTIMIZER
Solver name: Gurobi

Be sure to load the julia/1.5.0 and gurobi/9.0.1 modules in your Slurm script.
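
To go one step further, below is a minimal sketch of building and solving a small linear program with JuMP. It assumes a working Gurobi license on the cluster, and the variable names, bounds, and coefficients are purely illustrative:

```julia
using JuMP, Gurobi

model = Model(Gurobi.Optimizer)
@variable(model, 0 <= x <= 2)      # decision variable with bounds
@variable(model, 0 <= y <= 30)
@objective(model, Max, 5x + 3y)    # linear objective
@constraint(model, x + 5y <= 3)    # linear constraint
optimize!(model)

println("x = ", value(x), "  y = ", value(y))
```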

 

Julia Environments and GPU Jobs

If you are working on multiple Julia projects, each requiring a different set of packages, use environments to isolate the packages of each project. To create an environment, pass a path to the activate command in the package manager. The example below creates two environments.

First, we make environment 1:

$ module load julia/1.5.0
$ julia
julia>  # start package manager by pressing ] key
(@v1.5) pkg> activate "/home/<YourNetID>/.julia/project1"
(project1) pkg> add DifferentialEquations
(project1) pkg> activate  # leave the environment
(@v1.5) pkg>

Next, create a second environment with the GPU package CUDA:

(@v1.5) pkg> activate "/home/<YourNetID>/.julia/project2"
(project2) pkg> add CUDA
(project2) pkg>  # leave package manager by pressing backspace/delete
julia> exit()

The two environments are independent: CUDA is not available in project1, and DifferentialEquations is not available in project2.
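
To check which packages an environment provides, activate it and run Pkg.status. Here is a sketch using a throwaway directory so it can be run anywhere; on the cluster you would pass the project1 or project2 path instead:

```julia
using Pkg

# a throwaway environment for demonstration; substitute your project path
dir = mktempdir()
Pkg.activate(dir)
Pkg.status()   # a freshly created environment contains no packages
```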

The Julia script for project1 might be this:

using Pkg
Pkg.activate("/home/<YourNetID>/.julia/project1")
Pkg.instantiate()

using DifferentialEquations
println("Success")

Below is a Julia script that could be used with project2:

using Pkg
Pkg.activate("/home/<YourNetID>/.julia/project2")
Pkg.instantiate()

using CUDA, Test

N = 2^20
x_d = CUDA.fill(1.0f0, N)
y_d = CUDA.fill(2.0f0, N)
y_d .+= x_d
@test all(Array(y_d) .== 3.0f0)
println("Success")

For project2 with CUDA the Slurm script would be:

#!/bin/bash
#SBATCH --job-name=myjob         # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks across all nodes
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=4G         # memory per cpu-core (4G per cpu-core is default)
#SBATCH --gres=gpu:1             # number of gpus per node
#SBATCH --time=00:01:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-user=<YourNetID>@princeton.edu

module purge
module load julia/1.5.0 cudatoolkit/11.0 cudnn/cuda-11.0/8.0.2

julia myscript.jl

Submit the job with: sbatch job.slurm

You may encounter warnings when Julia tries to update package registries, which are not accessible from the compute nodes. Periodically run a manual update on the login node like this:

$ julia
julia> using Pkg
julia> Pkg.activate("/home/<YourNetID>/.julia/project2")
julia> Pkg.update()

 

Storing Packages

By default, Julia packages are stored in /home/<YourNetID>/.julia/packages. If you want to store your Julia packages on /scratch/gpfs to free up space in /home, set this environment variable in your ~/.bashrc file:

export JULIA_DEPOT_PATH=/scratch/gpfs/$USER/myjulia

Keep in mind that /scratch/gpfs is not backed up.
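
To confirm where Julia will install packages after setting the variable, you can inspect DEPOT_PATH from within Julia; the first entry is where new packages, registries, and compiled caches are written:

```julia
# the first depot entry is where packages, registries, and compiled caches go
println(first(DEPOT_PATH))
```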

 

Debugging

For GPU programming, to see which libraries are being used, start Julia in this way:

$ JULIA_DEBUG=CUDAapi julia

 

Getting Help

If you encounter any difficulties while running Julia on the HPC clusters then please send an email to cses@princeton.edu or attend a help session.