…and how Genomics Aotearoa is helping researchers be a little less scared of the “command line”
By Alana Alexander
30 January 2019
A “random walk” is a mathsy term for a path strung together with a bunch of random steps (not unlike trying to walk my stubborn St. Bernard cross on a leash).
Despite being a bioinformatician (which folks often think means “also a maths whiz”), my maths is (unfortunately) not great. Therefore, I mention random walks not because I’m about to drop a knowledge bomb about higher-dimensional Riemannian manifolds (thanks Wikipedia!), but because I think a random walk is a great way to describe how I got into bioinformatics.
Many researchers assume bioinformaticians have trained extensively to commune with processors, compilers, SLURM scripts, and core dumps. That’s definitely one way to become a bioinformatician, and probably a really smart one! Multiple universities in Aotearoa and internationally offer bioinformatics specialisations and/or degrees, and many of the other (brilliant) Genomics Aotearoa postdocs have formal training in computer science and/or bioinformatics.
But… I’m going to tell you about the alternative career path I took, to show you it is never too late to come over to the dark side and become a bioinformatician.
As a kid, “Little Lanie” was obsessed with rock pools/sharks/whales, and was going to become a marine biologist. I enrolled in a BSc in Biology at the University of Auckland, and my descent to the dark side began innocuously enough: I realised that genetics – and particularly molecular ecology – was a cool way to learn more about critters inhabiting the marine environment. My focus on genetics was also strengthened by fieldwork; it turns out when I don’t have my eyes fixed firmly on the horizon, I get seasick! During my PhD on sperm whale population genetics at Oregon State University (don’t worry, I worked on samples generously collected by other folks, thereby avoiding seasickness), my descent continued further as I got my first exposure to “next-generation sequencing”.
You see, young-uns, back in the old days^, the order of the peaks in the beautiful, colourful squiggly lines known as Sanger chromatograms would give us the order of our DNA bases. This process was partly automated but still generally required a “check by eye” step to make sure nothing squiffy was going on.
However, current sequencing technologies are very different; they spit out millions of reads in a text-based format. Going through these by eye would take FOREVER. Instead, it became important to come up with automated ways to trim poor-quality bits off the end of reads, to assemble them into the bigger bits of sequence we are interested in, and to do other kinds of analyses.
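To give a flavour of what “trimming poor-quality bits off the end of reads” can look like, here is a minimal Python sketch. It is a toy, not how any particular trimming tool works: it assumes quality scores are stored as Phred+33 characters (a common FASTQ convention), and the function names, the example read, and the cut-off of 20 are all made up for illustration. Real tools use more careful approaches than just chopping from the end.

```python
def phred_scores(quality_string, offset=33):
    """Convert a quality string into Phred scores (assumes Phred+33 encoding)."""
    return [ord(char) - offset for char in quality_string]


def trim_low_quality_end(sequence, quality_string, min_quality=20):
    """Drop bases from the end of a read until one meets min_quality.

    Hypothetical helper for illustration: it only looks at the end of the
    read, one base at a time.
    """
    scores = phred_scores(quality_string)
    keep = len(sequence)
    # Walk backwards from the end while the base quality is below the cut-off.
    while keep > 0 and scores[keep - 1] < min_quality:
        keep -= 1
    return sequence[:keep], quality_string[:keep]


# A made-up read: 'I' encodes Phred 40 (good), '#' encodes Phred 2 (poor).
read = "ACGTACGTAC"
quals = "IIIIIII###"
trimmed_read, trimmed_quals = trim_low_quality_end(read, quals)
print(trimmed_read)  # prints ACGTACG
```

Now imagine doing that by eye for millions of reads, and you can see why the command line starts to look appealing.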
In short, I had to learn how to do stuff on the COMMAND LINE *suspenseful music*. For some folks, this is super intimidating and the absolute worst. For me, I found it … fun. Figuring out how to get code to behave was kind of like a really nerdy (but useful) puzzle.
Building on this, each of my subsequent jobs had bioinformatics as a larger and larger slice of my job description. I upskilled through courses and workshops but, more importantly, by googling the heck out of error messages!
The key bioinformatics skill needed is stubbornness (I will defeat you, code!). However, knowing when to reach out to folks either in person or on the interwebs for help is also useful. You also don’t have to be an ‘official bioinformatician’ for bioinformatics skills to be useful to you; a little proficiency on the command line is all it takes to be able to access all kinds of wonderful pre-written packages and pipelines that you can make do the analyses you want to do.
If this is something you are interested in but you don’t know where to start, good news! One of Genomics Aotearoa’s projects is building bioinformatics capability – helping empower researchers to be a little less scared of the command line.
One of the ways we are doing this is through Software Carpentry and Data Carpentry workshops. These are great ways to break the ice, as you are introduced to some of the cool things you can do alongside others who are in the same “this code stuff is super scary” boat (full disclosure: I’m a volunteer Data Carpentry instructor!).
Who knows? If you keep at it, and with enough random walking, you could be writing the next GA bioinformatics blog!
^ Sanger sequencing is still a super useful technology depending on the scale of projects you are working on. I miss those squiggles.