A graphical parallel debugger

The following instructions apply to della, stellar, tiger, traverse, adroit, tigressdata, and nobel unless otherwise stated. Be sure to connect directly to one of the head nodes rather than going through a gateway such as tigressgateway.

See the Linaro DDT getting started guide and user manual.

Initial Setup

  • Create a graphical desktop session (recommended) or start an instance of an X server on your personal system and then use "ssh -X".
  • Configure your software environment on the cluster. This typically involves loading environment modules.
  • Build your application as you normally would, but add the -g option to the mpicc, mpif90, icc, ifort, gcc (etc.) command line to enable source-level debugging. Note that with some compilers -g also lowers the default optimization level, so your code will run slower.
  • Note: You may need to explicitly set the optimization level to -O0 to ensure that the correct line is highlighted in DDT when breaking or single-stepping; optimization can reorder the lines being executed and thus cause the wrong line to be highlighted. A sample debug build follows this list.
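For example, a debug build of an MPI code might look like the following (myprog.c and myprog.f90 are placeholder file names):

mpicc -g -O0 myprog.c -o myprog      # C: -g adds debug symbols, -O0 disables optimization
mpif90 -g -O0 myprog.f90 -o myprog   # Fortran equivalent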

Example of Debugging a Simple Open MPI Code

The latest version of DDT can be made available with:

module load ddt/24.0

Here we illustrate the main steps for Stellar. Connect to MyStellar (VPN required if off-campus) and choose "Interactive Apps", then "Desktop of Stellar Vis node". When the session starts, click on the black terminal icon next to Firefox to launch a terminal. In the terminal, run the commands shown in the figure below:

[Figure: DDT 22.0 Setup (terminal commands)]
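If the figure is unavailable, the commands amount to loading the DDT module and launching the debugger:

module load ddt/24.0
ddt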

After DDT launches, choose "RUN". Then enter the settings below:

[Figure: DDT 22.0 Settings (run dialog)]

Finally, click on the "Run" button at the bottom of the screen. This will launch a Slurm job on a compute node with 4 processes.

The DDT debugging interface is shown in the image below. You can step into, step over, set breakpoints, etc. You can also jump between the different processes and inspect their local variables by clicking on the process IDs (shown as pink squares).

[Figure: DDT 22.0 Session (debugging interface)]

Standard DDT Setup

  • Load the module:  'module load ddt/24.0'
  • Launch the program:  'ddt'
  • The DDT window should now be displayed.
  • If this is the first time you're running DDT on a given system, DDT will be configured with default values for that system. Change them as needed, as described below.
  • In the opening 'Linaro DDT' page, click on 'Run'.
  • Select your Application, Arguments, Input File, and Working Directory.
  • If running on a cluster other than Nobel or Tigressdata, select "Submit to Queue" and click on Parameters.
    • Choose a Wall Clock Limit.
    • Do not change the queue (if present) unless you have been given access to a special queue.
    • If you wish to receive an email notification at the beginning (begin) or end (end) of a job, or when the job aborts (fail), change the default to suit your preference (e.g., all).
    • There is no need to change the Email address unless you do not have a Princeton address; in that case, specify your preferred address here.
    • Click OK.
    • If you find that the job hangs with the message "Waiting for job XXXXXXX to start" (even though it is clearly running), try using an interactive session with "salloc" (see directions below).
  • If this is an MPI parallel application, then select the MPI checkbox.
    • Specify the Number of Nodes, and Processes per Node.
  • If this is an OpenMP application, then select the OpenMP checkbox.
    • Specify the number of threads.
    • If using a hybrid MPI/OpenMP application, this represents the number of threads per MPI process.
    • Important: For all OpenMP jobs on all clusters, an additional configuration change is required. Click on the Options button (near the bottom left). This will open another window. Select "Job Submission" from the left hand menu. Then change the "Submission template file:" field to "/usr/licensed/ddt/templates/slurm-openmp.qtf".
  • Set other parameters as desired.
  • On Nobel, click on Run; otherwise, click on Submit.

On Nobel, the job will start immediately.

On clusters other than Nobel, the debugging job will be submitted to the Slurm batch queue, and the debugging session will start automatically when the job begins to run. Once the program has run to completion, you can restart the session without requeueing the job. If you need to change any options or parameters, however, you will need to end the current session and start another, which queues a new job.

Be sure to remove any "module load" commands in your .bashrc or .cshrc file when you are done debugging.

Running in offline mode

DDT can also be run in non-interactive (offline) mode, which is useful for memory debugging and tracepoints. Below is a sample Slurm script for doing this on Traverse:

#!/bin/bash
#SBATCH -N 2                         # number of nodes
#SBATCH -t 1:00:00                   # wall-clock limit (HH:MM:SS)
#SBATCH --ntasks-per-node=4          # MPI processes per node
#SBATCH --ntasks-per-socket=2        # at most two processes per socket
#SBATCH --gpus-per-task=1            # one GPU per MPI process
#SBATCH --gpu-bind=map_gpu:0,1,2,3   # bind successive tasks to GPUs 0-3
#SBATCH --cpus-per-task=32           # CPU cores per MPI process
#SBATCH --reservation=test           # site-specific reservation; omit if not applicable
module load pgi/19.9/64
module load openmpi/pgi-19.9/4.0.4/64
module load cudatoolkit/10.1
module load hdf5/pgi-19.9/openmpi-4.0.4/1.10.6
module load fftw/gcc/openmpi-4.0.1/3.3.8
module load ddt/24.0
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # one OpenMP thread per allocated core
export SLURM_OVERLAP=1                        # allow DDT's srun step to share the allocation
export SLURM_GPUS_PER_TASK=0                  # clear so the srun flag below controls the GPU request
ddt --offline --output=output.html srun --gpus-per-task=1 ./a.out   # report is written to output.html
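Memory debugging and tracepoints can be requested with additional flags. Below is a sketch, assuming the --mem-debug and --trace-at options described in the Linaro DDT user manual; the file name, line number, and variable names are placeholders:

ddt --offline --mem-debug=thorough --output=memdebug.html srun ./a.out
ddt --offline --trace-at=main.c:42,x,y --output=trace.html srun ./a.out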


Re-Using a Slurm Interactive Session for Multiple DDT Runs (does not apply to Nobel)

This approach is useful when Slurm batch queue times are long. It allows you to start one interactive Slurm session and then re-use it for multiple DDT sessions, as long as the resources needed do not exceed what was requested for the interactive session.

  • Type salloc --nodes=X --ntasks-per-node=Y --time=HH:MM:SS --x11 (a worked example follows below).
  • Wait for the interactive session prompt (on a compute node).
  • Run 'module load ddt/24.0' and then launch DDT by running the command 'ddt'.
  • The DDT window should now be displayed by your X server.
  • If this is the first time you're running DDT on a given system, DDT will be configured with default values for that system. Change them as needed.
  • In the opening 'Linaro DDT' page, click on 'Run'.
  • Select your Application, Arguments, Input File, and Working Directory.
  • Deselect "Submit to Queue".
  • If this is an MPI parallel application, then select the MPI checkbox.
    • Set "Number of processes" to X*Y
  • Set other parameters as desired.
  • Click Run.

The session will start immediately, and the processes will be distributed according to the resource specifications from the salloc command.
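For example, to debug with 8 MPI processes (so "Number of processes" = 8) across two nodes for up to one hour:

salloc --nodes=2 --ntasks-per-node=4 --time=01:00:00 --x11
module load ddt/24.0
ddt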

Debugging C/C++/Fortran shared objects called from R (and other scripting languages)

The following instructions apply to della, tiger, tukey, adroit, tigressdata, and nobel. If you need to debug on another cluster, please send a request to [email protected].

DDT supports debugging C, C++, or Fortran shared objects called from R (or Python or other scripting languages) through pending breakpoints, which are resolved when the shared library is loaded by the script. Instructions for debugging C code under R follow:

  • Rebuild your C shared object (*.so file) with the compiler of your choice using its -g command line option to support source-level debugging with DDT:
    • Create/edit ~/.R/Makevars
      • To use GCC compiler, set:
        CFLAGS=-g
        LIBR=-g
      • To use Intel compilers, load the necessary intel module and set:
        CC=icc -gcc
        CXX=icpc -gcc
        CFLAGS=-g
        SHLIB_LD=icc
        LIBR=-g
    • NOTE: set other compiler and linker options as desired on the CFLAGS and LIBR lines.
    • Type: R CMD SHLIB <source_file_path/name> (a worked example follows after this list).
  • Start up DDT as usual, and click on "Run and Debug a Program".
  • For "Application", enter "/usr/lib64/R/bin/exec/R". This is the actual R executable that gets called by the /usr/bin/R script. DDT needs the real executable file, not a script.
  • For "Arguments", enter "-f<your_R_script_path_and_name>".
  • Leave the various checkboxes unchecked, on the assumption that MPI is not being used. But if you are planning to debug a long-running C, C++, or Fortran code on a cluster, select "Submit to Queue".
  • Click on "Environment Variables" and enter:
    R_HOME=/usr/lib64/R
    DDT_ENTRY_POINT=main
  • Click "Run" or "Submit".
  • In the main DDT window, right-click anywhere in the left-hand "Project Files" panel, and select "Add file". Browse to the directory containing the C source file you want to debug, select the desired source file, and click "Open". (This is only needed the first time you debug this source file.)
  • In the "Project Files" panel's Application Code section, click on the '+' next to "Source Files" to display your source file name.
  • Double-click on your source file name to open it. Ignore any warning or dialog box regarding the source file being newer than the executable.
  • Click in the left margin of any line in your C source code to set a breakpoint. You will see a dialog box stating that the breakpoint will only be activated when your shared object is loaded. Click "Yes".
  • Click on "Play", and the code will run until it reaches your breakpoint. You can now do DDT debugging as usual.

Debugging C++ shared objects called from R using the Rcpp package

DDT supports the debugging of C++ shared objects called from R using the Rcpp package. Note that you can control the build options by using ~/.R/Makevars as explained above. Instructions are as follows:

  • Create a tmp directory under your home directory (e.g., mkdir ${HOME}/tmp).
  • Start up DDT as usual, and click on "Run and Debug a Program".
  • For "Application", enter "/usr/lib64/R/bin/exec/R". This is the actual R executable that gets called by the /usr/bin/R script. DDT needs the real executable file, not a script.
  • For "Arguments", enter "-f<your_R_script_path_and_name>".
  • The various checkboxes should be clear, on the assumption that MPI is not being used.
  • Click on "Environment Variables" and enter:
    R_HOME=/usr/lib64/R
    DDT_ENTRY_POINT=main
    TMPDIR=${HOME}/tmp
  • Click "Run".
  • Click on the lower-left breakpoint tab.
  • Right-click anywhere in that panel, and click on Add Breakpoint.
  • In the dialog box that comes up, click on Function, type in 'dlopen' (without the quotes), and click on "Add".
  • Click "Play".
  • Each time the dlopen breakpoint is hit, look in the Current Line(s) panel on the right to see the name of the SO file that is about to be dynamically loaded.
    • If it is not some variant of "sourceCpp" (e.g., "sourceCpp_4809.so"), then continue.
    • Otherwise pause.
  • Click on "Step Out (F6)". This will load your *.cpp code.
  • In the "Project Files" panel's External Code section, click on the '+' next to "Source Files" to display your source file name.
  • Double-click on your source file name to open it. Ignore any warning or dialog box regarding the source file being newer than the executable.
  • Click in the left margin of any line in your C++ source code to set a breakpoint. You will see a dialog box stating that the breakpoint will only be activated when your shared object is loaded. Click "Yes".
  • Click on "Play", and the code will run until it reaches your breakpoint. You can now do DDT debugging as usual.

Memory Debugging: Problems using Open MPI

When DDT memory debugging is enabled, you may sometimes see the following warnings when debugging an Open MPI application:

mpirun: --------------------------------------------------------------------------
mpirun: A process attempted to use the "leave pinned" MPI feature, but no
mpirun: memory registration hooks were found on the system at run time. This
mpirun: may be the result of running on a system that does not support memory
mpirun: hooks or having some other software subvert Open MPI's use of the
mpirun: memory hooks. You can disable Open MPI's use of memory hooks by
mpirun: setting both the mpi_leave_pinned and mpi_leave_pinned_pipeline MCA
mpirun: parameters to 0.
mpirun:
mpirun: Open MPI will disable any transports that are attempting to use the
mpirun: leave pinned functionality; your job may still run, but may fall back
mpirun: to a slower network transport (such as TCP).
mpirun: --------------------------------------------------------------------------

To avoid this warning while still allowing Open MPI to use the InfiniBand fabric, add the following parameters to the mpirun arguments field of DDT's Run dialog box:

--mca mpi_leave_pinned 0 --mca mpi_leave_pinned_pipeline 0
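
For reference, the same MCA settings on a plain mpirun command line (outside DDT) would look like this; the process count and executable are placeholders:

mpirun --mca mpi_leave_pinned 0 --mca mpi_leave_pinned_pipeline 0 -np 4 ./a.out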