
Data analysis skills are in hot demand

Mik Black

What should we be doing about it?

By Mik Black

29 July 2020

The increased availability of complex biological data sets means that analysis and computation are becoming critically important skills for New Zealand’s future scientists. Because of this, we need to be doing everything we can to help our students develop these skills, to better prepare them for large-scale data analysis across a range of fields.

At present, students focusing on the biological sciences in New Zealand tend not to take the more “computational” subjects – statistics, mathematics, computer and/or information science – yet the growing importance of data manipulation and analysis in genomics makes these important and sought-after skills.

While there are now more students coming through biology-focused programmes like genetics and biochemistry who are starting to include computational/analytic training (great to see), those numbers need to grow.

What is driving the need for data analysis capability in genomic science?

Over the past two decades, the field of genomics has moved at an incredible pace, and the volume of biological data we are able to produce with high throughput methods is staggering.

There is now widespread acceptance that genomic technologies can help us to understand the world around us, and in particular improve human health.

Genomics also has huge implications for conservation, where it has the potential to play an important role in helping to slow the global decline in biodiversity.

While it is relatively easy to generate large amounts of genomic data about any organism, it’s what you do with the information that is important. And this is what has fuelled the strong demand for analysis and interpretation, putting disciplines like Bioinformatics and Data Science in hot demand.

Bioinformatics is the development of methods and software tools for understanding the biological data derived from genomics. It is an evolving science, but it is essential for managing data in modern medicine and biology.

For me, as a statistician, Data Science is like statistics on steroids. While everyone has a slightly different definition, it is largely a blend of key skills from mathematics, statistics and computer/information science, focusing on the management, manipulation, visualisation and analysis of large and sometimes disparate data sets.

The combination of these skill sets – bioinformatics and data science – is what will help us address the considerable computational challenges involved in developing pipelines that take raw biological data as input, and output practical tools for use by health professionals and those at the front lines of primary production and conservation.

While New Zealand definitely has high-quality people who are skilled in these areas, we need a whole lot more of them to help us find solutions to some of our country’s major challenges, such as health inequities.

A comprehensive approach to building this capability will allow us to accomplish many things, from implementing precision medicine models to help doctors choose treatments that are a genetic match for their patients, to developing practical applications to help save endangered species like the kākāpō.

So, what are we doing?

The government has recognised the importance of genomics to New Zealand’s future. It has funded Genomics Aotearoa to build bioinformatics capability, to make use of genomics research already happening internationally, and to use New Zealand-specific genomic information for good in our own country.

A planned programme of training within Genomics Aotearoa is aimed at retrofitting these skills for New Zealand genomics researchers. We have already started programmes to train people who can then go on to teach tools to people in their own organisations.

In partnership with the New Zealand eScience Infrastructure (NeSI), we have been delivering Carpentries training (www.carpentries.org) to teach generic skills in programming, version control and reproducibility to researchers who have not been exposed to these concepts as part of their formal education.

We are now using the Carpentries model to deliver training for bioinformatics skills. Genomics Aotearoa was one of the first organisations in the world to run the Genomics Data Carpentry workshop, delivered to almost 200 New Zealand researchers in its first six months, and more workshops are on the way – Metagenomics, RNA-sequencing, Genotyping-by-Sequencing, Reproducible Research for Genomics – all free and open source.

We are already making a huge difference for students and researchers. But we also have a responsibility as scientists to ensure that school and university students are aware of the importance of data analyses in all science subjects, and are being encouraged to develop these skills.

Bioinformatics and Data Science skills need to be seen as a fundamental component of modern genomic science. Universities can still do more to incorporate these skills into our existing papers, and in the longer term, we would like to see courses teaching these skills being integrated into university degree programmes to reflect their growing importance across the entire science research spectrum.

Associate Professor Mik Black works in the Department of Biochemistry, University of Otago, and leads the Genomics Aotearoa Bioinformatic Capability project.
