Maximal viral information recovery from sequence data using VirMAP

Accurate classification of the human virome is critical to a full understanding of the role viruses play in health and disease. This requires sensitive, specific, and practical pipelines that return precise outputs while still enabling case-specific post hoc analysis. Viral taxonomic characterization from metagenomic data suffers from high background noise and signal crosstalk that confound current methods. Here we develop VirMAP, which overcomes these limitations by merging nucleotide and protein information to taxonomically classify viral reconstructions independent of genome coverage or read overlap. We validate VirMAP using published data sets and viral mock communities containing RNA and DNA viruses and bacteriophages. VirMAP offers opportunities to enhance metagenomic studies seeking to define virome-host interactions, improve biosurveillance capabilities, and strengthen molecular epidemiology reporting.
