Detection of Large Numbers of Novel Sequences in the Metatranscriptomes of Complex Marine Microbial Communities

Background: Sequencing the expressed genetic information of an ecosystem (the metatranscriptome) can provide information about the response of organisms to varying environmental conditions. Until recently, metatranscriptomics has been limited to microarray technology and random cloning methodologies. The application of high-throughput sequencing technology is now enabling access to both known and previously unknown transcripts in natural communities.

Methodology/Principal Findings: We present a study of a complex marine metatranscriptome obtained from random whole-community mRNA using GS-FLX pyrosequencing technology. Eight samples, four DNA and four mRNA, were processed from two time points in a controlled coastal ocean mesocosm study (Bergen, Norway) involving an induced phytoplankton bloom. Sequencing yielded a total of 323,161,989 base pairs. Our study confirms the finding of the first published metatranscriptomic studies of marine and soil environments that metatranscriptomics targets highly expressed sequences, which are frequently novel. Our alternative methodology increases the range of experimental options available for conducting such studies and is characterized by an exceptional enrichment of mRNA (99.92%) versus ribosomal RNA. Analysis of the corresponding metagenomes confirms much higher levels of assembly in the metatranscriptomic samples and a far higher yield of large gene families with >100 members, ∼91% of which were novel.

Conclusions/Significance: This study provides further evidence that metatranscriptomic studies of natural microbial communities are not only feasible but, when paired with metagenomic data sets, offer an unprecedented opportunity to explore both the structure and function of microbial communities, provided we can overcome the challenge of elucidating the functions of so many never-before-seen gene families.
