Systematic and stochastic influences on the performance of the MinION nanopore sequencer across a range of nucleotide bias

Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are the likely to lead to accurate and rapid field-forward diagnostics.

[1]  Michael C. Schatz,et al.  Oxford Nanopore Sequencing, Hybrid Error Correction, and de novo Assembly of a Eukaryotic Genome , 2015 .

[2]  S. Turner,et al.  Real-Time DNA Sequencing from Single Polymerase Molecules , 2009, Science.

[3]  Matthew Loose,et al.  Real-time selective sequencing using nanopore technology , 2016, Nature Methods.

[4]  Stefan Engelen,et al.  Genome assembly using Nanopore-guided long and error-free DNA reads , 2015, BMC Genomics.

[5]  Richard Durbin,et al.  Sequence analysis Fast and accurate short read alignment with Burrows – Wheeler transform , 2009 .

[6]  S. Turner,et al.  Real-time DNA sequencing from single polymerase molecules. , 2010, Methods in enzymology.

[7]  Jiannis Ragoussis,et al.  Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations , 2016, Scientific Reports.

[8]  N. Lennon,et al.  Characterizing and measuring bias in sequence data , 2013, Genome Biology.

[9]  Kin-Fan Au,et al.  PacBio Sequencing and Its Applications , 2015, Genom. Proteom. Bioinform..

[10]  Joshua Quick,et al.  Rapid draft sequencing and real-time nanopore sequencing in a hospital outbreak of Salmonella , 2015, Genome Biology.

[11]  Thomas Abeel,et al.  Nanopore sequencing enables near-complete de novo assembly of Saccharomyces cerevisiae reference strain CEN.PK113-7D , 2017, bioRxiv.

[12]  Hugh E. Olsen,et al.  The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community , 2016, Genome Biology.

[13]  Trevor Bedford,et al.  Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples , 2017, Nature Protocols.

[14]  Patrick Wincker,et al.  High-Quality de Novo Genome Assembly of the Dekkera bruxellensis Yeast Using Nanopore MinION Sequencing , 2017, G3: Genes, Genomes, Genetics.

[15]  T. Fennell,et al.  Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries , 2011, Genome Biology.

[16]  M. Forsman,et al.  Scaffolding of a bacterial genome using MinION nanopore sequencing , 2015, Scientific Reports.

[17]  Michael Liem,et al.  Rapid de novo assembly of the European eel genome from nanopore sequencing reads , 2017, Scientific Reports.

[18]  Y. Benjamini,et al.  Summarizing and correcting the GC content bias in high-throughput sequencing , 2012, Nucleic acids research.

[19]  Sergey Koren,et al.  De Novo Assembly of a New Solanum pennellii Accession Using Nanopore Sequencing[CC-BY] , 2017, Plant Cell.

[20]  Gonçalo R. Abecasis,et al.  The Sequence Alignment/Map format and SAMtools , 2009, Bioinform..

[21]  Mark Akeson,et al.  Reading canonical and modified nucleotides in 16S ribosomal RNA using nanopore direct RNA sequencing , 2017, bioRxiv.

[22]  Phelim Bradley,et al.  Same-Day Diagnostic and Surveillance Data for Tuberculosis via Whole-Genome Sequencing of Direct Respiratory Samples , 2016, Journal of Clinical Microbiology.

[23]  J. Landolin,et al.  Assembling large genomes with single-molecule sequencing and locality-sensitive hashing , 2014, Nature Biotechnology.

[24]  Tyson A. Clark,et al.  Direct detection of DNA methylation during single-molecule, real-time sequencing , 2010, Nature Methods.

[25]  Charles E. Lawrence,et al.  Sequencing ultra-long DNA molecules with the Oxford Nanopore MinION , 2015, bioRxiv.

[26]  Richard J. Hall,et al.  MinION nanopore sequencing of an influenza genome , 2015, Front. Microbiol..

[27]  Jordan M. Eizenga,et al.  Mapping DNA Methylation with High Throughput Nanopore Sequencing , 2017, Nature Methods.

[28]  Benedict Paten,et al.  Improved data analysis for the MinION nanopore sequencer , 2015, Nature Methods.

[29]  Doug Stryke,et al.  Rapid metagenomic identification of viral pathogens in clinical samples by real-time nanopore sequencing analysis , 2015, Genome Medicine.

[30]  Aaron R. Quinlan,et al.  Bioinformatics Applications Note Genome Analysis Bedtools: a Flexible Suite of Utilities for Comparing Genomic Features , 2022 .

[31]  Richard Durbin,et al.  Fast and accurate long-read alignment with Burrows–Wheeler transform , 2010, Bioinform..

[32]  Daniel R. Garalde,et al.  Highly parallel direct RNA sequencing on an array of nanopores , 2016, Nature Methods.

[33]  S. Koren,et al.  Nanopore sequencing and assembly of a human genome with ultra-long reads , 2017, bioRxiv.

[34]  T Laver,et al.  Assessing the performance of the Oxford Nanopore Technologies MinION , 2015, Biomolecular detection and quantification.

[35]  T. Blauwkamp,et al.  Comprehensive transcriptome analysis using synthetic long-read sequencing reveals molecular co-association of distant splicing events , 2015, Nature Biotechnology.

[36]  Tsunglin Liu,et al.  Effects of GC Bias in Next-Generation-Sequencing Data on De Novo Genome Assembly , 2013, PloS one.

[37]  David A. Matthews,et al.  Real-time, portable genome sequencing for Ebola surveillance , 2016, Nature.