Improving Analysis Workflows with Snakemake

Apr 5, 2022, 1:00 pm - 3:30 pm
Online Event


Event Description
Tired of writing sbatch scripts and complex bash logic for your work? Does your directory look like 'step_1.slurm, step_2.slurm, step_3.slurm, step_3_final.slurm'? Have you struggled to replicate previous results because some intermediate steps are lost to your shell history? Then you are ready to improve your analysis pipelines with a workflow management system! Snakemake is a concise but descriptive framework for specifying workflows, and it interfaces with HPC systems. Because Snakemake is written in Python, complex relationships between steps can be described with Python scripting, and any command you can run in a terminal can be executed as part of a workflow. In this workshop, you will take a series of sbatch scripts and develop them into a Snakemake workflow to create a reproducible, distributable, and efficient analysis pipeline. Several cookie-cutter examples will be provided to help jumpstart your work.
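
As a rough illustration of the kind of conversion described above, here is a minimal sketch of a Snakefile that chains two steps. The file names, rule names, and shell commands are hypothetical placeholders, not the workshop's actual materials; in practice each shell line would hold the command from the corresponding sbatch script.

```python
# Snakefile: a minimal sketch with hypothetical file names and commands.

# The first rule lists the final file(s) the workflow should produce.
rule all:
    input:
        "results/summary.txt"

# Stands in for what step_1.slurm might do: clean the raw data.
rule step_1:
    input:
        "data/raw.txt"
    output:
        "results/cleaned.txt"
    shell:
        "sort {input} > {output}"

# Stands in for what step_2.slurm might do: summarize the cleaned data.
rule step_2:
    input:
        "results/cleaned.txt"
    output:
        "results/summary.txt"
    shell:
        "wc -l {input} > {output}"
```

Assuming Snakemake is installed and the file is saved as Snakefile, `snakemake -n` prints the planned jobs without running them, and `snakemake --cores 1` runs the workflow locally. Snakemake works out the order from the input and output declarations, so step_1 runs before step_2 without any explicit sequencing.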

Learning objectives: Attendees will learn how to convert their workflows to Snakemake and run them with the Slurm scheduler.

Knowledge prerequisites: Basic Linux and HPC knowledge, plus some familiarity with conda.

Hardware/software prerequisites: (1) Bring a laptop that can connect to the eduroam wireless network. You will also need to be able to authenticate with Duo to use campus resources. (2) Have an SSH client installed on your laptop. (3) Register for an account on Adroit, the cluster we will use for demonstration purposes. Make sure you can SSH to Adroit before the workshop by following this guide. (4) Create a conda environment and install Snakemake.

Workshop format: Demonstration and hands-on