Debugging with DDT on the clusters

The following instructions apply to della, stellar, tiger, traverse, adroit, tigressdata and nobel unless otherwise stated. Be sure to connect directly to one of the head nodes as opposed to going through a gateway like tigressgateway.

See the ARM DDT getting started guide and user manual.

 

Initial Setup

  • Start an instance of an X server on your personal system.
  • ssh into the system's head node with X11 forwarding enabled. Depending on your settings, you may need to invoke the ssh client with "ssh -X".
  • If not running on Nobel, then on the system's head node, in your .cshrc or .bashrc file:
    • Set up your module environment:
      • tiger: module load intel/19.0/64/19.0.5.281 intel-mpi/intel/2019.5/64 cudatoolkit/11.0 (the cudatoolkit module is needed only if running GPU codes)
      • other clusters: module load intel/19.0/64/19.0.5.281 intel-mpi/intel/2019.5/64
    • Log out and log back in for it to take effect.
  • Build your application as you normally would, but add the -g option to the mpicc, mpif90, icc, ifort, gcc (etc.) command line to enable source-level debugging. Note that this will turn off compiler optimization, and your code will run more slowly. (See the example after this list.)
  • Note: You may need to explicitly set the optimization level to -O0 to ensure that the correct line is highlighted in DDT when breaking or single-stepping. Optimization can change the order in which lines are executed and thus cause the wrong line to be highlighted.
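
As an illustration, here is a minimal sketch of these steps for an MPI code (the cluster name, NetID, and file names are placeholders; substitute your own):

# on your personal system: connect with X11 forwarding enabled
ssh -X <netid>@della.princeton.edu

# in ~/.bashrc (or .cshrc) on the head node; log out and back in afterwards
module load intel/19.0/64/19.0.5.281 intel-mpi/intel/2019.5/64

# rebuild with debug symbols and optimization disabled
mpicc -g -O0 -o hello_mpi hello_mpi.c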

 

Standard DDT Setup

  • Load the module:  'module load ddt/20.0.1'
  • Launch the program:  'ddt'
  • The DDT window should now be displayed by your X server.
  • If this is the first time you're running DDT on a given system, DDT will be configured with default values for that system. Change them as needed, as described below.
  • In the opening 'Allinea DDT' page, click on 'Run'.
  • Select your Application, Arguments, Input File, and Working Directory.
  • If running on a cluster other than Nobel or Tigressdata, select "Submit to Queue", and click on Parameters.
    • Choose a Wall Clock Limit.
    • Do not change the queue (if present) unless you have been given access to a special queue.
    • If you wish to have an Email notification at the beginning (begin) or end (end) of a job, or when the job aborts (fail), change the default to suit your preference (e.g., all).
    • There is no need to change the Email address unless you do not have a Princeton address; if you don't, specify your preferred email address here.
    • Click OK.
    • If you find that the job hangs with the message "Waiting for job XXXXXXX to start" (even while it is clearly running) then try using an interactive session with "salloc" (see directions below).
  • If this is an MPI parallel application, then select the MPI checkbox.
    • Specify the Number of Nodes, and Processes per Node.
  • If this is an OpenMP application, then select the OpenMP checkbox.
    • Specify the number of threads.
    • If using a hybrid MPI/OpenMP application, this represents the number of threads per MPI process.
    • Important: For all OpenMP jobs on all clusters, an additional configuration change is required. Click on the Options button (near the bottom left). This will open another window. Select "Job Submission" from the left hand menu. Then change the "Submission template file:" field to "/usr/licensed/ddt/templates/slurm-openmp.qtf".
  • Set other parameters as desired.
  • On Nobel, click on Run; otherwise, click on Submit.

On Nobel, the job will start immediately.

On clusters other than Nobel, the debugging job will be submitted to the SLURM batch queue, and the debugging session will start automatically when the job begins to run. Once the program has run to completion, you can restart the session without submitting a new SLURM job. If you need to change any options or parameters, however, you will need to end the current session and start another, which queues a new job.

Be sure to remove any "module load" commands from your .bashrc or .cshrc file when you are done debugging.

 

Running in offline mode

DDT can also be run non-interactively (offline mode), which is useful for memory debugging and tracepoints. Below is a sample Slurm script for doing this on Traverse:

#!/bin/bash
#SBATCH -N 2
#SBATCH -t 1:00:00
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks-per-socket=2
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=map_gpu:0,1,2,3
#SBATCH --cpus-per-task=32
#SBATCH --reservation=test

module load pgi/19.9/64
module load openmpi/pgi-19.9/4.0.4/64
module load cudatoolkit/10.1
module load hdf5/pgi-19.9/openmpi-4.0.4/1.10.6
module load fftw/gcc/openmpi-4.0.1/3.3.8
module load ddt/20.0.1

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # one OpenMP thread per CPU allocated to each task
export SLURM_OVERLAP=1                        # allow the srun step launched by DDT to share the allocation
export SLURM_GPUS_PER_TASK=0                  # GPUs are requested explicitly on the srun line below

# run the program under DDT non-interactively; the report is written to output.html
ddt --offline --output=output.html srun --gpus-per-task=1 ./a.out
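
Assuming the script above is saved as ddt_offline.slurm (the file name is arbitrary), it is submitted like any other batch job:

sbatch ddt_offline.slurm
# when the job finishes, copy output.html to your workstation (e.g., with scp) and open it in a browser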

 

 

Re-Using a SLURM Interactive Session for Multiple DDT Runs (does not apply to Nobel)

This approach is useful when SLURM batch queue times are long: it lets you start one interactive SLURM session and re-use it for multiple DDT sessions, as long as the resources needed do not exceed what was requested for the interactive session.

  • Type 'salloc --nodes=X --ntasks-per-node=Y --time=HH:MM:SS --x11' (a concrete example is given after these steps).
  • Wait for the interactive session prompt (on a compute node).
  • Type '/usr/licensed/bin/ddt'.
  • The DDT window should now be displayed by your X server.
  • If this is the first time you're running DDT on a given system, DDT will be configured with default values for that system. Change them as needed.
  • In the opening 'Allinea DDT' page, click on 'Run'.
  • Select your Application, Arguments, Input File, and Working Directory.
  • Deselect "Submit to Queue".
  • If this is an MPI parallel application, then select the MPI checkbox.
    • Set "Number of processes" to X*Y
  • Set other parameters as desired.
  • Click Run.

The session will start immediately, and the processes will be distributed according to the resource specifications from the salloc command.
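
For example, an interactive session with 2 nodes and 4 tasks per node for one hour (illustrative values) would be used as follows:

salloc --nodes=2 --ntasks-per-node=4 --time=01:00:00 --x11
# once the prompt appears on a compute node:
/usr/licensed/bin/ddt
# in DDT's Run dialog, deselect "Submit to Queue" and set "Number of processes" to 8 (2 nodes x 4 tasks per node)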

 

Debugging C/C++/Fortran shared objects called from R (and other scripting languages)

The following instructions apply to della, tiger, tukey, adroit, tigressdata, and nobel. If you need to debug on another cluster, please send a request to [email protected].

DDT allows the debugging of C, C++, or Fortran shared objects called from R (or Python or other scripting languages) by supporting pending breakpoints, which are resolved when the shared library is loaded by the script. Instructions for debugging C code under R follow:

  • Rebuild your C shared object (*.so file) with the compiler of your choice using its -g command line option to support source-level debugging with DDT:
    • Create/edit ~/.R/Makevars
      • To use GCC compiler, set:
        CFLAGS=-g
        LIBR=-g
      • To use Intel compilers, load the necessary intel module and set:
        CC=icc -gcc
        CXX=icpc -gcc
        CFLAGS=-g
        SHLIB_LD=icc
        LIBR=-g
    • NOTE: set other compiler and linker options as desired on the CFLAGS and LIBR lines.
    • Type: R CMD SHLIB <source_file_path/name> (a command-line sketch of this step appears after these instructions).
  • Start up DDT as usual, and click on "Run and Debug a Program".
  • For "Application", enter "/usr/lib64/R/bin/exec/R". This is the actual R executable that gets called by the /usr/bin/R script. DDT needs the real executable file, not a script.
  • For "Arguments", enter "-f<your_R_script_path_and_name>".
  • The various checkboxes should be clear, on the assumption that MPI is not being used. But if you are planning to debug a long-running C, C++ or Fortran code on a cluster, select "Submit to Queue".
  • Click on "Environment Variables" and enter:
    R_HOME=/usr/lib64/R
    DDT_ENTRY_POINT=main
  • Click "Run" or "Submit".
  • In the main DDT window, right-click anywhere in the left-hand "Project Files" panel, and select "Add file". Browse to the directory containing the C source file you want to debug, select the desired source file, and click "Open". (This is only needed the first time you debug this source file.)
  • In the "Project Files" panel's Application Code section, click on the '+' next to "Source Files" to display your source file name.
  • Double-click on your source file name to open it. Ignore any warning or dialog box regarding the source file being newer than the executable.
  • Click in the left margin of any line in your C source code to set a breakpoint. You will see a dialog box stating that the breakpoint will only be activated when your shared object is loaded. Click "Yes".
  • Click on "Play", and the code will run until it reaches your breakpoint. You can now do DDT debugging as usual.
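
For reference, here is a command-line sketch of the rebuild step, assuming a GCC build, a shared-object source file named mycode.c, and an R script named myscript.R (both names are placeholders):

# after setting CFLAGS=-g and LIBR=-g in ~/.R/Makevars as described above:
R CMD SHLIB mycode.c      # produces mycode.so with debug symbols in the same directory
# DDT's Run dialog then uses:
#   Application: /usr/lib64/R/bin/exec/R
#   Arguments:   -f myscript.R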

 

Debugging C++ shared objects called from R using the Rcpp package

DDT supports the debugging of C++ shared objects called from R using the Rcpp package. Note that you can control the build options by using ~/.R/Makevars as explained above. Instructions are as follows:

  • Create a tmp directory under your home directory (e.g., mkdir ${HOME}/tmp).
  • Start up DDT as usual, and click on "Run and Debug a Program".
  • For "Application", enter "/usr/lib64/R/bin/exec/R". This is the actual R executable that gets called by the /usr/bin/R script. DDT needs the real executable file, not a script.
  • For "Arguments", enter "-f<your_R_script_path_and_name>".
  • The various checkboxes should be clear, on the assumption that MPI is not being used.
  • Click on "Environment Variables" and enter:
    R_HOME=/usr/lib64/R
    DDT_ENTRY_POINT=main
    TMPDIR=${HOME}/tmp
  • Click "Run".
  • Click on the lower-left breakpoint tab.
  • Right-click anywhere in that panel, and click on Add Breakpoint.
  • In the dialog box that comes up, click on Function, type in 'dlopen' (without the quotes), and click on "Add".
  • Click "Play".
  • Each time the dlopen breakpoint is hit, look in the Current Line(s) panel on the right to see the name of the SO file that is about to be dynamically loaded.
    • If it is not some variant of "sourceCpp" (e.g., "sourceCpp_4809.so"), then continue.
    • Otherwise pause.
  • Click on "Step Out (F6)". This will load your *.cpp code.
  • In the "Project Files" panel's External Code section, click on the '+' next to "Source Files" to display your source file name.
  • Double-click on your source file name to open it. Ignore any warning or dialog box regarding the source file being newer than the executable.
  • Click in the left margin of any line in your C++ source code to set a breakpoint. You will see a dialog box stating that the breakpoint will only be activated when your shared object is loaded. Click "Yes".
  • Click on "Play", and the code will run until it reaches your breakpoint. You can now do DDT debugging as usual.

 

Memory Debugging: Problems using OpenMPI

When DDT Memory Debugging is enabled, you may sometimes see the following warnings when debugging an OpenMPI application:

mpirun: --------------------------------------------------------------------------
mpirun: A process attempted to use the "leave pinned" MPI feature, but no
mpirun: memory registration hooks were found on the system at run time. This
mpirun: may be the result of running on a system that does not support memory
mpirun: hooks or having some other software subvert Open MPI's use of the
mpirun: memory hooks. You can disable Open MPI's use of memory hooks by
mpirun: setting both the mpi_leave_pinned and mpi_leave_pinned_pipeline MCA
mpirun: parameters to 0.
mpirun:
mpirun: Open MPI will disable any transports that are attempting to use the
mpirun: leave pinned functionality; your job may still run, but may fall back
mpirun: to a slower network transport (such as TCP).
mpirun: --------------------------------------------------------------------------

In order to avoid this warning and still allow OpenMPI to use the InfiniBand fabric, you will need to add the following parameters to the mpirun arguments field of DDT's Run dialog box:

--mca mpi_leave_pinned 0 --mca mpi_leave_pinned_pipeline 0