Comparison of long read sequencing technologies in resolving bacteria and fly genomes

Background: The newest generation of DNA sequencing technology is highlighted by the ability to sequence reads hundreds of kilobases in length, and the increased availability of long read data has democratized the genome sequencing and assembly process. PacBio and Oxford Nanopore Technologies (ONT) have pioneered competitive long read platforms, with more recent work focused on improving sequencing throughput and per-base accuracy. Released in 2019, the PacBio Sequel II platform advertises substantial enhancements over previous PacBio systems.

Results: We used whole-genome sequencing data produced by two PacBio platforms (Sequel II and RS II) and two ONT protocols (Rapid Sequencing and Ligation Sequencing) to compare assemblies of the bacterium Escherichia coli and the fruit fly Drosophila ananassae. Sequel II assemblies had higher contiguity and consensus accuracy than those from the other methods, even after accounting for differences in sequencing throughput. ONT RAPID libraries had the fewest chimeric reads and quantified E. coli plasmids better than ligation-based libraries. Assembly quality can be enhanced by adopting hybrid approaches: adding Illumina libraries for bacterial genome assemblies, or combining ONT and Sequel II libraries for eukaryotic genome assemblies. Genome-wide DNA methylation could be detected using both technologies; however, ONT libraries enabled the identification of a broader range of known E. coli methyltransferase recognition motifs, in addition to undocumented D. ananassae motifs.

Conclusions: The ideal choice of long read technology may depend on several factors, including the question or hypothesis under examination. No single technology outperformed the others in all metrics examined.
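The contiguity comparison reported above can be made concrete with a small sketch. The abstract does not say which contiguity statistic was used; N50 is assumed here only because it is a common choice, and the function name `n50` and the contig lengths are hypothetical illustrations, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


# Hypothetical assemblies with the same total length (150 units):
fragmented = [50, 40, 30, 20, 10]
contiguous = [120, 30]

print(n50(fragmented))
print(n50(contiguous))
```

Because both example assemblies have the same total length, the larger N50 of the second reflects only that its sequence sits in fewer, longer contigs, which is the sense in which one assembly is more contiguous than another.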
