Ultra-deep, long-read nanopore sequencing of mock microbial community standards

Abstract

Background: Long sequencing reads are information-rich, aiding de novo assembly and reference mapping, and consequently have great potential for the study of microbial communities. However, the best approaches for analysis of long-read metagenomic data are unknown. Additionally, rigorous evaluation of bioinformatics tools is hindered by a lack of long-read data from validated samples with known composition.

Findings: We sequenced 2 commercially available mock communities containing 10 microbial species (ZymoBIOMICS Microbial Community Standards) with Oxford Nanopore GridION and PromethION. Both communities and the 10 individual species isolates were also sequenced with Illumina technology. We generated 14 and 16 gigabase pairs from 2 GridION flowcells and 150 and 153 gigabase pairs from 2 PromethION flowcells for the evenly distributed and log-distributed communities, respectively. Read length N50 ranged between 5.3 and 5.4 kilobase pairs over the 4 sequencing runs. Basecalls and corresponding signal data are made available (4.2 TB in total). Alignment to Illumina-sequenced isolates demonstrated the expected microbial species at anticipated abundances, with the limit of detection for the lowest-abundance species below 50 cells (GridION). De novo assembly of metagenomes recovered long contiguous sequences without the need for pre-processing techniques such as binning.

Conclusions: We present ultra-deep, long-read nanopore datasets from a well-defined mock community. These datasets will be useful for those developing bioinformatics methods for long-read metagenomics and for the validation and comparison of current laboratory and software pipelines.
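The read length N50 quoted in the findings is the length L at which reads of length L or longer together contain at least half of all sequenced bases. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `read_length_n50` and the toy read lengths are assumptions made for illustration and are not taken from the datasets.

```python
def read_length_n50(lengths):
    """Return the N50 of a collection of read lengths (in bases).

    Hypothetical helper for illustration only; it is not part of any
    tool named in this document.
    """
    # Walk from the longest read downwards, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first read that pushes the running total to at least half
        # of all bases defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no read lengths supplied")


# Toy read lengths (assumed values, not from the study).
toy_lengths = [2000, 3000, 5000, 6000, 8000]
print(read_length_n50(toy_lengths))  # prints 6000
```

In this toy set the two longest reads (8000 and 6000 bases) already cover 14000 of the 24000 total bases, so the N50 is 6000. Unlike the median read length, N50 is weighted by bases rather than by read count, so a small number of very long reads can raise it noticeably.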
