A comprehensive metatranscriptome analysis pipeline and its validation using human small intestine microbiota datasets

Background: Next generation sequencing (NGS) technologies can be applied to complex microbial ecosystems for metatranscriptome analysis by direct cDNA sequencing, known as RNA sequencing (RNA-seq). RNA-seq generates large datasets of great complexity, and their comprehensive interpretation requires a reliable bioinformatic pipeline. In this study, we focus on the development of such a metatranscriptome pipeline, which we validate using Illumina RNA-seq datasets derived from the small intestine microbiota of two individuals with an ileostomy.

Results: The metatranscriptome pipeline developed here enabled effective removal of rRNA-derived sequences, followed by confident assignment of the predicted function and taxonomic origin of the mRNA reads. Phylogenetic analysis of the small intestine metatranscriptome datasets revealed a strong similarity with the community composition profiles obtained from 16S rDNA and rRNA pyrosequencing, indicating considerable congruency between community composition (rDNA) and the taxonomic distribution of overall (rRNA) and specific (mRNA) activity among its microbial members. Reproducibility of the metatranscriptome sequencing approach was established by independent duplicate experiments. In addition, a comparison of metatranscriptome analyses using single-end or paired-end sequencing indicated that paired-end sequencing does not provide improved functional or phylogenetic insights. Metatranscriptome functional mapping allowed the analysis of both global and genus-specific activity of the microbiota, and illustrated the potential of these approaches to unravel syntrophic interactions in microbial ecosystems.

Conclusions: A reliable pipeline for metatranscriptome data analysis was developed and evaluated using RNA-seq datasets obtained for the human small intestine microbiota. The set-up of the pipeline is generic and can be applied to (bacterial) metatranscriptome analysis in any chosen niche.
