De novo yeast genome assemblies from MinION, PacBio and MiSeq platforms

Long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore MinION can produce reads with average lengths of over 10,000 base pairs and maximum lengths reaching 100,000 base pairs. Compared with short-read assemblies, assemblies obtained from long-read platforms have much higher contig continuity and genome completeness, because long fragments can extend assembly paths into problematic or repetitive regions. Many successful assembly applications of the Pacific Biosciences technology have been reported, ranging from small bacterial genomes to large plant and animal genomes. Recently, genome assemblies using Oxford Nanopore MinION data have attracted much attention owing to the portability and low cost of this novel sequencing instrument. In this paper, we re-sequenced a well-characterized genome, that of the Saccharomyces cerevisiae S288C strain, using three different platforms: MinION, PacBio and MiSeq. We present a comprehensive comparison of metrics for assemblies generated by various pipelines and discuss how platform-associated data characteristics affect assembly quality. At a read depth of 31X, the assemblies from both Pacific Biosciences and Oxford Nanopore MinION show excellent continuity and completeness for the 16 nuclear chromosomes, but not for the mitochondrial genome, whose reconstruction remains a significant challenge.
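
The two quantities the abstract relies on, read depth and contig continuity, can be made concrete with a short sketch. The code below is illustrative only and is not taken from the paper's pipelines: the function names are hypothetical, read depth is taken to be total sequenced bases divided by genome size, and N50 is assumed here as a common proxy for continuity (the abstract does not say which continuity metric was used).

```python
def read_depth(read_lengths, genome_size):
    """Average read depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


def n50(contig_lengths):
    """N50: length of the contig at which the running total of contigs,
    sorted longest first, reaches at least half of the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must contain at least one contig")


if __name__ == "__main__":
    # Toy numbers, not data from the paper.
    toy_reads = [10_000] * 10      # ten reads of 10 kb each
    toy_genome_size = 50_000       # hypothetical 50 kb genome
    print(read_depth(toy_reads, toy_genome_size))

    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_contigs))
```

On these terms, a 31X data set is one whose reads sum to about 31 times the genome length, and a more continuous assembly concentrates that length in fewer, longer contigs, which raises N50.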
