Metavisitor, a suite of Galaxy tools for simple and rapid detection and discovery of viruses in deep sequence data

We present user-friendly, adaptable software that gives biologists, clinical researchers and possibly diagnostic clinicians the ability to robustly detect and reconstruct viral genomes from complex deep sequence datasets. A set of modular bioinformatic tools and workflows was implemented as the Metavisitor package in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can run existing Metavisitor workflows or adapt them to specific needs by adding or modifying analysis modules. Metavisitor can be used on our Mississippi server or installed on any Galaxy server instance; a pre-configured Metavisitor server image is also provided. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can combine de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide executable Metavisitor use cases, which increase the accessibility and transparency of the software and ultimately enable biologists or clinicians to focus on biological or medical questions.
