Characterization and simulation of metagenomic nanopore sequencing data with Meta-NanoSim

Nanopore sequencing is crucial to metagenomic studies because its kilobase-long reads help resolve genomic structural differences among microbes. However, platform-specific challenges, including high base-calling error rates, non-uniform read lengths, and chimeric artifacts, necessitate specifically designed analytical tools. Here, we present Meta-NanoSim, a fast and versatile utility that characterizes and simulates the unique properties of nanopore metagenomic reads. Further, Meta-NanoSim improves upon state-of-the-art methods for microbial abundance estimation through a base-level quantification algorithm. Through a metagenomic assembly benchmarking task, we demonstrate that Meta-NanoSim-simulated data can facilitate the development of metagenomic algorithms and guide experimental design.
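The abstract does not spell out the base-level quantification algorithm, so the following is only a minimal sketch of the general idea: with non-uniform read lengths, counting aligned bases per genome can give a different abundance estimate than counting reads. The function name `estimate_abundance` and the `(genome, aligned_bases)` input tuples are hypothetical and are not part of Meta-NanoSim's interface.

```python
from collections import defaultdict


def estimate_abundance(alignments, level="base"):
    """Return relative abundance per genome.

    alignments: iterable of (genome, aligned_bases) tuples, one per read
                (hypothetical input format, assumed for illustration).
    level: "base" weights each read by its aligned bases;
           "read" counts every read once.
    """
    if level not in ("base", "read"):
        raise ValueError("level must be 'base' or 'read'")

    totals = defaultdict(float)
    for genome, aligned_bases in alignments:
        # Base-level: a long read contributes more than a short one.
        totals[genome] += aligned_bases if level == "base" else 1

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {genome: count / grand_total for genome, count in totals.items()}


# One long read from genome A, three short reads from genome B.
example = [("A", 9000), ("B", 1000), ("B", 1000), ("B", 1000)]
read_level = estimate_abundance(example, level="read")
base_level = estimate_abundance(example, level="base")
```

In this toy input, read-level counting favors genome B (three of four reads), while base-level counting favors genome A (9,000 of 12,000 aligned bases), which shows why read length variability matters for abundance estimation.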
