Flexible design of multiple metagenomics classification pipelines with UGENE

SUMMARY: UGENE is a free, open-source, cross-platform bioinformatics software package. It ships with pre-defined pipelines and provides a flexible workflow designer for visually building new multi-step analysis pipelines. The UGENE v.1.31 release adds graphical, user-friendly wrappers for several popular command-line metagenomics classification programs (Kraken, CLARK, DIAMOND), which can be combined serially and in parallel through the workflow designer and used with multiple, customizable reference databases. Ensemble classification voting is available through the WEVOTE algorithm, with output augmented by detailed table reports. Pre-built workflows covering all steps from data cleaning to summaries are included with the installation, and a tutorial is available on the UGENE website. Further expansion with multiple visualization tools for reports is planned.

AVAILABILITY AND IMPLEMENTATION: UGENE is available at http://ugene.net/, is implemented in C++ and Qt, and is released under the GNU General Public License (GPL) version 2.
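The idea behind the ensemble voting step mentioned in the summary can be illustrated with a deliberately simplified sketch. It is not the WEVOTE algorithm, which weights votes using the taxonomy tree, and it does not use any UGENE API. It only shows a plain plurality vote over per-read taxon calls from several classifiers. The function names, the `min_support` parameter, the input layout, and the convention that taxon ID 0 means "unclassified" are all assumptions made for illustration.

```python
from collections import Counter

UNCLASSIFIED = 0  # assumed convention: taxon ID 0 means "no call"


def plurality_vote(calls, min_support=2):
    """Return the taxon ID backed by the most tools, or UNCLASSIFIED.

    A read stays unclassified when the best-supported taxon has fewer
    than `min_support` votes or is tied with another taxon.
    """
    counts = Counter(t for t in calls if t != UNCLASSIFIED)
    if not counts:
        return UNCLASSIFIED
    ranked = counts.most_common()
    top_taxon, top_votes = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    if top_votes < min_support or tied:
        return UNCLASSIFIED
    return top_taxon


def combine(per_tool_calls, min_support=2):
    """Merge calls given as {tool name: {read ID: taxon ID}}.

    Reads missing from a tool's output count as unclassified for that tool.
    """
    read_ids = set()
    for calls in per_tool_calls.values():
        read_ids.update(calls)
    return {
        read_id: plurality_vote(
            [calls.get(read_id, UNCLASSIFIED) for calls in per_tool_calls.values()],
            min_support,
        )
        for read_id in sorted(read_ids)
    }


if __name__ == "__main__":
    # Illustrative calls only; these are not results from any real run.
    example = {
        "kraken": {"r1": 562, "r2": 1280, "r3": 0},
        "clark": {"r1": 562, "r2": 1279},
        "diamond": {"r1": 561, "r2": 0, "r3": 9606},
    }
    print(combine(example))
```

In this sketch a read is assigned a taxon only when at least `min_support` tools agree on it without a tie, so a lone call from a single classifier is treated as insufficient evidence. A taxonomy-aware method such as WEVOTE can instead resolve disagreements at a common ancestor rather than discarding the read.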
