Author: Gayle Gaddis
Held at Princeton on July 10–14, and the first event of its kind anywhere, the INTERSECT Bootcamp aimed to narrow the gap between computer science and the research sciences, paving the way for a new community of research software engineers.
Funded by a grant from the National Science Foundation (NSF), through the Princeton Institute for Computational Science and Engineering (PICSciE), the event brought together 35 computational researchers from widely diverse backgrounds to advance their understanding of software engineering techniques and best practices. The organizers’ hope was that it would not only propel research forward but also inspire participants to spread the word, and the knowledge. Based on the overwhelmingly positive feedback, goal achieved.
Creating an industry where virtually none exists
What do you get when you cross a theologian with a computer scientist? It’s not a riddle, and you could as easily substitute physicist, social scientist or climatologist for theologian. The answer is actually a new genre of researcher, the research software engineer (RSE) — and the only riddle is how to become one. Princeton has long been solving the universe’s riddles — from the human genome to atomic fission to continental drift — so the University, through PICSciE, was a natural host to this revolution in research computing.
Why a “revolution”? Today, there is no formal education in research software engineering. It simply doesn’t exist. What’s more, while organizations like the United States Research Software Engineer Association (US-RSE) have emerged to foster an RSE community, the challenge of educating new RSEs remains. In fact, the need grows daily, with computer models expanding researchers’ grasp of everything from the physical world to the political landscape by astounding degrees. As Dr. Jeff Carver, INTERSECT co-founder and professor of Computer Science at the University of Alabama, expressed it, “raising awareness of the need for training is not as big a deal as raising awareness that it exists.”
To be clear, not everyone doing the work intends to become an RSE as a career, but even for those who simply want to optimize their research methodology, INTERSECT has offered up transformative lessons.
How INTERSECT came together
The challenge of research software engineering, essentially, is that experts in a discipline frequently have no formal training in coding, while programmers often have little experience in a specific discipline. It was a Venn diagram with no center until Dr. Carver met Ian Cosden, senior director of Research Software Engineering at Princeton. As founding members of the US-RSE — one of only a handful of organizations dedicated to the field today — they came together over the lack of training for RSEs, and INTERSECT was born.
The call for applications described the workshop as a means to “help research software developers improve the quality, reproducibility and sustainability of their software.” That was enough to inspire researchers to apply in droves, resulting in far more applicants than there were spots.
A learning experience like no other
“I learned more in these five days than I did in the past two years.” That was how Forrest Brown, a data scientist working with machine learning at Sandia National Laboratories, summed it up. He was not alone:
“This is the single most influential program I have attended in grad school, maybe even my whole college career.”
“This bootcamp has been, hands down, the best experience in my self-taught programming journey by far.”
“I cannot think of how I would have gotten exposure to all these important concepts otherwise.”
Virtually all of the participants, the majority of them grad students, came into the workshop entirely self-taught. They were asked to have mastered the fundamentals before arriving, but none had in-depth training. Each left, as the feedback showed, feeling like the event had changed their relationship to their research. “What really surprised me was just how much it’s possible to do within the world of research software engineering,” wrote Jana Perkins, a computational social scientist and cultural analyst from the University of Illinois at Urbana-Champaign. “It’s opened up several different career paths for me.”
Over four and a half days, the bootcamp covered topics from design, packaging, licensing and distribution to documentation, project management and testing, the highlight for many. Using the principles of Continuous Integration/Continuous Deployment (CI/CD), it taught the methods and importance of testing code continually as you’re writing it. “I would spend days tackling one little problem,” said Zach Butler, a theology grad student using machine learning to extract data from 4th-century biblical texts. “Taking a little more time on the front end for best practices…helped shave off days, maybe even weeks, from what I was doing.”
Diversity on full display
Another highlight, for all involved, was the incredible diversity of the participants. The organizers had insisted that diversity be built into the workshop framework, but what they found surprised even them. Attendees came from 28 different institutions in addition to Princeton: universities like Johns Hopkins and Northwestern; labs like Sandia and Oak Ridge; and unexpected sources like the New Orleans Baptist Theological Seminary and the American Academy of Dermatology. Their disciplines were equally varied, from physics and biology to public policy, cosmology and sociology, from electric vehicles to art history. “These were researchers from many different domains,” said co-founder Ian Cosden, “but the common ground was what we were teaching.”
Attendees soon saw the value of “meeting people from other fields and just getting out of my own perspective,” as Zach Butler put it. He was inspired to see “what types of data they’re working with, what questions they’re asking of their data, and how code is helping them answer those questions.”
Building community and coding it forward
INTERSECT is an acronym for “INnovation Training Enabled by a Research Software Engineering Community of Trainers,” with “community” being one of the key objectives. As Ian Cosden put it, “the stated goal was to help grow this community by hitting people earlier in their careers, exposing them to these best practices, and driving people together.”
The strategy worked. Surprisingly for such a varied group, participants easily found commonalities and opportunities to collaborate. As Cosden paraphrased it: “they’re going to keep talking, because it turns out that this biology problem is not that dissimilar from this material science problem.”
Once the seeds of community were planted, the organizers cultivated them in several other ways, a shared Slack channel among them. “I feel very gratified that they have a Slack channel where we can exchange questions and messages,” said Valeri Vasquez, who recently finished her doctorate at UC Berkeley’s College of Natural Resources and is pursuing technologies for public health management under climate change. “It’s a community that we didn’t have access to otherwise.”
Having inspired that community, the INTERSECT founders’ hope is that it will continue to grow, through sharing, and even teaching, by the very participants who were just taught themselves. Valeri Vasquez, for one, is already thinking of how to manage a small team of grad students and postdocs. “There was a section on project management, on collaboration…I can draw from not just the materials of the workshop, but the way it was structured and presented; that then allows the students to grow on their own.”
The workshop is over for this year, but the conversations continue, the resources are still readily available, and the Slack channel is still alive — and with it, the kernels of revolutionary new research. The final word on INTERSECT ’23 came from Forrest Brown: “It was really well thought out. My only criticism was the weather.”