By Ann McCartney
We have recently completed the first high-quality genome of a stick insect using linked-read technology, but what does this mean? And why is a gold-standard reference genome important to New Zealand’s conservation efforts?
Stick insects are actually biologically interesting.
Firstly, in times of stress, they can become parthenogenetic, meaning the females lay eggs that produce offspring without needing to mate with males.
Secondly, they occur across a range of temperatures and altitudes in New Zealand. A better understanding of stick insect genomics will contribute much to global knowledge of the order Phasmatodea, including its biogeographic origin, reproduction and temperature tolerance, and how that tolerance will matter under climate change.
Furthermore, several New Zealand species are highly endangered, with one species occupying the highest Department of Conservation threatened species category.
Stick insects are also interesting because their genome is very large – substantially bigger in fact than the human genome.
A high-quality reference assembly provides more comprehensive genomic information than a fragmented one, so studies can then concentrate on important management issues such as reproduction and how these species are going to adapt to climate change.
So, producing a high-quality stick insect genome is really useful. However, previous efforts have been confounded by the genome’s highly repetitive content and size.
To surmount this problem, new computational approaches had to be adopted and adapted. Having these new tools and techniques at our fingertips for further applications is where the stick insect has provided real value.
But to overcome the technical challenges, we needed high-performance computing.
A supernova problem
The Genomics Aotearoa High Quality Genomes project at Manaaki Whenua Landcare Research has been working on genome assemblies for five endemic New Zealand species: the rewarewa tree, the hihi bird and three stick insects (Niveaphasma, Clitarchus and Acanthoxyla).
The repetitive nature of stick insect genomes meant that reconstructing a genome accurately required long-read sequencing technology. However, traditional long-read sequencing is costly, and the sheer size of the stick insect genome meant this was not feasible.
So instead, we employed technology not widely used in New Zealand – Chromium10X linked read technology.
This technology uses a unique barcode system to label short DNA sequences that come from the same individual molecule and are therefore close to each other on the genome. Sequences sharing a barcode can then be linked to create longer sequence reads at a fraction of the price, and at the same time this sidesteps the contentious issue of long-read base-call error rates.
Linked reads were therefore the compromise between short-read and long-read sequencing technologies, as they provide pseudo-long reads at a more reasonable price.
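To make the barcode idea concrete, here is a minimal Python sketch of the grouping step only. It is not how Supernova works internally; the barcodes, the sequences and the function name group_by_barcode are all made up for illustration. It simply shows that short reads carrying the same barcode can be gathered into one set, which an assembler could then treat as evidence that they came from the same stretch of the genome.

```python
from collections import defaultdict

# Hypothetical toy data: each read is (barcode, short DNA sequence).
# Real barcodes and reads are longer and come from sequencer output files;
# that parsing is left out of this sketch.
reads = [
    ("AAAC", "GATTACA"),
    ("GGTT", "TTAGGCA"),
    ("AAAC", "CCATGGA"),
    ("GGTT", "ACGTACG"),
    ("AAAC", "TTGACCA"),
]


def group_by_barcode(reads):
    """Collect short reads that share a barcode, keeping their input order."""
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)
    return dict(groups)


groups = group_by_barcode(reads)
for barcode, sequences in sorted(groups.items()):
    # Each group is a set of reads presumed to originate from the same molecule.
    print(barcode, len(sequences), sequences)
```

In this toy example, the five short reads fall into two groups, one per barcode. The long-range information in linked reads comes from this shared label rather than from any single long read.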
When we first outlined to staff at NeSI (New Zealand eScience Infrastructure) our plans to run this new de novo sequencing application (Chromium10X linked read technology), it became clear this could be quite a technical challenge, as the computational environment required for these analyses is very complex.
We had to research what was available, create environments and test under different parameters.
NeSI provides Genomics Aotearoa with access to dedicated high-memory compute. Although Supernova, the sole assembler available for this new technology, was new to them, NeSI’s combination of bioinformatics expertise and high-performance computing capability successfully addressed the barriers.
It took three months to adapt techniques to get the first test of the genome assembly running, and the first complete assembly will be published this year. This has significant implications well beyond stick insects. Ultimately, it’s about the pipeline: developing reproducible techniques and software that use linked reads to produce a high-quality genome assembly and that can be applied to other genomes.
The implications of using linked-read technology
The large and highly repetitive stick insect genome was the perfect test to determine whether pseudo-long-read (linked-read) technology was a more affordable, and more appropriate, sequencing approach for genomes of this nature. Due to its success, other New Zealand species of interest are now being sequenced using this technology, including the blueberry, hihi, and myna.
The workflows we created using the linked-read technology will make it considerably easier and quicker to produce quality genomic information for further endangered birds and other animals.
It has also enabled Genomics Aotearoa to decide what other genomes from species of interest within our conservation and primary production programmes can be sequenced and assembled using this technology.