METAGENassist: a comprehensive web server for comparative metagenomics

With recent improvements in DNA sequencing and sample extraction techniques, the quantity and quality of metagenomic data are now growing exponentially. This abundance of richly annotated metagenomic data and bacterial census information has spawned a new branch of microbiology called comparative metagenomics. Comparative metagenomics involves the comparison of bacterial populations between different environmental samples, different culture conditions or different microbial hosts. However, performing comparative metagenomics typically requires a sophisticated knowledge of multivariate statistics and/or advanced software programming skills. To make comparative metagenomics more accessible to microbiologists, we have developed a freely accessible, easy-to-use web server for comparative metagenomic analysis called METAGENassist. Users can upload their bacterial census data in a wide variety of common formats, derived from either amplified 16S rRNA data or shotgun metagenomic data. Metadata concerning environmental, culture or host conditions can also be uploaded. During the data upload process, METAGENassist also performs an automated taxonomic-to-phenotypic mapping. Phenotypic information covering nearly 20 functional categories, such as GC content, genome size, oxygen requirements, energy sources and preferred temperature range, is automatically generated from the taxonomic input data. Using this phenotypically enriched data, users can then perform a variety of multivariate and univariate data analyses, including fold change analysis, t-tests, principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), clustering and classification. To facilitate data processing, users are guided through a step-by-step analysis workflow using a variety of menus, information hyperlinks and check boxes. METAGENassist also generates colorful, publication-quality tables and graphs that can be downloaded and used directly in the preparation of scientific papers. METAGENassist is available at http://www.metagenassist.ca.
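
To make the taxonomic-to-phenotypic mapping and the fold change analysis concrete, the following Python sketch collapses genus-level counts into oxygen-requirement categories and then compares two groups of samples. It is only an illustration under stated assumptions and is not METAGENassist's implementation: the lookup table `PHENOTYPE_BY_GENUS`, the function names, the pseudocount and the sample counts are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical lookup table (genus -> oxygen requirement), for illustration only.
# A real mapping would come from a curated phenotype database.
PHENOTYPE_BY_GENUS = {
    "Bacteroides": "anaerobe",
    "Clostridium": "anaerobe",
    "Pseudomonas": "aerobe",
    "Escherichia": "facultative",
}


def to_phenotype_profile(genus_counts):
    """Collapse one sample's genus counts into relative phenotype abundances."""
    totals = defaultdict(float)
    for genus, count in genus_counts.items():
        totals[PHENOTYPE_BY_GENUS.get(genus, "unknown")] += count
    grand_total = sum(totals.values())
    return {label: value / grand_total for label, value in totals.items()}


def fold_change(group_a, group_b, pseudocount=1e-6):
    """Ratio of mean relative abundance (group_a / group_b) per phenotype.

    The pseudocount (an assumed value) avoids division by zero when a
    phenotype is absent from one group.
    """
    labels = set().union(*group_a, *group_b)
    result = {}
    for label in labels:
        mean_a = mean(profile.get(label, 0.0) for profile in group_a)
        mean_b = mean(profile.get(label, 0.0) for profile in group_b)
        result[label] = (mean_a + pseudocount) / (mean_b + pseudocount)
    return result


# Made-up counts for two single-sample groups.
group_one = [to_phenotype_profile({"Bacteroides": 60, "Escherichia": 40})]
group_two = [to_phenotype_profile({"Clostridium": 30, "Pseudomonas": 70})]
changes = fold_change(group_one, group_two)
```

Here `changes` maps each phenotype category to its fold change between the two groups. In practice, each group would contain many samples, and a fold change would usually be reported alongside a significance test such as a t-test.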
