Metagenomic microbial community profiling using unique clade-specific marker genes

Metagenomic shotgun sequencing data can identify the microbes populating a microbial community and their proportions, but existing taxonomic profiling methods are inefficient for increasingly large data sets. We present an approach that uses clade-specific marker genes to unambiguously assign reads to microbial clades more accurately and >50× faster than current approaches. We validated our metagenomic phylogenetic analysis tool, MetaPhlAn, on terabases of short reads and provide the largest metagenomic profiling to date of the human gut. MetaPhlAn can be accessed at http://huttenhower.sph.harvard.edu/metaphlan/.
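
To make the marker-based idea concrete, the following is a minimal sketch rather than MetaPhlAn's actual implementation. It assumes reads have already been aligned to marker genes by some external aligner, that each marker belongs to exactly one clade, and that clade abundance can be approximated by read counts normalized by the total marker length of the clade. The function name `clade_relative_abundance` and the inputs `read_hits`, `marker_clade`, and `marker_length` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def clade_relative_abundance(read_hits, marker_clade, marker_length):
    """Estimate clade proportions from reads mapped to marker genes.

    read_hits:     iterable of marker IDs, one entry per mapped read (assumed input)
    marker_clade:  dict mapping marker ID -> clade name (each marker is unique to one clade)
    marker_length: dict mapping marker ID -> marker length in nucleotides
    """
    # Count reads per clade; hits to unknown markers are ignored.
    reads_per_clade = defaultdict(int)
    for marker in read_hits:
        clade = marker_clade.get(marker)
        if clade is not None:
            reads_per_clade[clade] += 1

    # Total marker length per clade, used to normalize the read counts.
    length_per_clade = defaultdict(int)
    for marker, clade in marker_clade.items():
        length_per_clade[clade] += marker_length[marker]

    # Length-normalized coverage, then rescaled so the proportions sum to 1.
    coverage = {c: n / length_per_clade[c] for c, n in reads_per_clade.items()}
    total = sum(coverage.values())
    return {c: v / total for c, v in coverage.items()} if total else {}


if __name__ == "__main__":
    # Toy data: clade A has two 1000-nt markers, clade B has one 500-nt marker.
    marker_clade = {"m1": "A", "m2": "A", "m3": "B"}
    marker_length = {"m1": 1000, "m2": 1000, "m3": 500}
    read_hits = ["m1", "m2", "m3", "m3", "not_a_marker"]
    print(clade_relative_abundance(read_hits, marker_clade, marker_length))
```

Because each marker is unique to one clade, a read that hits a marker can be assigned without resolving ambiguity among multiple candidate taxa, and restricting alignment to markers keeps the reference much smaller than a collection of whole genomes. Both points are consistent with the accuracy and speed claims above, though the sketch itself demonstrates neither.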
