Computational integration of genomic traits into 16S rDNA microbiota sequencing studies.

Molecular sequencing techniques help to understand microbial biodiversity with regard to species richness, assembly structure and function. The methods available in this context are barcoding, metabarcoding, genomics and metagenomics. The first two are restricted to taxonomic assignments, whilst genomics addresses the functional capabilities of only a single organism. Metagenomics, by contrast, yields information about both the organismal and the functional diversity of a community. However, it is currently very demanding in terms of labour and cost and is thus not feasible for most laboratories. Here, we show in a proof-of-concept that computational approaches can recover functional information about microbial communities assessed through 16S rDNA (meta)barcoding by referring to reference genomes. We developed an automatic pipeline to show that such integration may infer preliminary or supplementary genomic content of a community. We applied it to two biological datasets and identified protein families that are significantly overrepresented between communities. The script, alongside supporting data, is available at http://bioapps.biozentrum.uni-wuerzburg.de.
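The core idea of referring 16S-based taxon abundances to reference genomes can be illustrated with a minimal sketch. This is not the published pipeline: the function name, the taxon labels, the family identifiers and all counts below are hypothetical, and the simple abundance-weighted sum is an assumption made for illustration only.

```python
from collections import Counter


def infer_family_profile(taxon_abundance, genome_families):
    """Weight each reference genome's protein-family counts by taxon abundance.

    taxon_abundance: dict mapping taxon name -> read count in the community
    genome_families: dict mapping taxon name -> {family id: copies in genome}
    Returns the inferred family profile and the read count of taxa that
    have no reference genome (these contribute nothing to the profile).
    """
    profile = Counter()
    unmapped = 0
    for taxon, abundance in taxon_abundance.items():
        families = genome_families.get(taxon)
        if families is None:
            # No reference genome available: keep track of what is missed.
            unmapped += abundance
            continue
        for family, copies in families.items():
            profile[family] += abundance * copies
    return dict(profile), unmapped


# Hypothetical reference annotations and a hypothetical community table.
genome_families = {
    "taxon_A": {"family_1": 2, "family_2": 1},
    "taxon_B": {"family_2": 3},
}
community = {"taxon_A": 10, "taxon_B": 5, "taxon_C": 4}

profile, unmapped = infer_family_profile(community, genome_families)
print(profile, unmapped)
```

Reporting the unmapped read count alongside the profile makes explicit how much of the community lacks a reference genome, which is one reason such inferred content should be treated as preliminary or supplementary.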
