Identification of candidate structured RNAs in the marine organism 'Candidatus Pelagibacter ubique'

Background
Metagenomic sequence data are proving to be a vast resource for the discovery of biological components. Yet analysis of these data to identify functional RNAs lags behind efforts to characterize protein diversity. The genome of 'Candidatus Pelagibacter ubique' HTCC 1062 is the closest match for approximately 20% of marine metagenomic sequence reads. It is also small, contains little non-coding DNA, and has strikingly low GC content.

Results
To aid the discovery of RNA motifs within the marine metagenome, we exploited the genomic properties of 'Cand. P. ubique' by targeting our search to long intergenic regions (IGRs) with relatively high GC content. Analysis of known RNAs (rRNA, tRNA, riboswitches, etc.) shows that structured RNAs are significantly enriched in such IGRs. To identify additional candidate structured RNAs, we examined other IGRs with similar characteristics from 'Cand. P. ubique' using comparative genomics approaches in conjunction with marine metagenomic data. Employing this strategy, we discovered four candidate structured RNAs: a new riboswitch class and three additional likely cis-regulatory elements that precede genes encoding ribosomal proteins S2 and S12 and the cytoplasmic protein component of the signal recognition particle. We also describe four additional potential RNA motifs with few or no examples occurring outside the metagenomic data.

Conclusion
This work begins the process of identifying functional RNA motifs present in the metagenomic data and illustrates how existing completed genomes may be used to aid in this task.
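
The IGR-targeting idea described in the abstract (prioritizing long intergenic regions with relatively high GC content) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Gene` record, the function names, and the `min_len` and `min_gc` thresholds are hypothetical placeholders chosen for illustration, and the genome is treated as a linear string, so a gap spanning the origin of a circular chromosome is not joined.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class Gene:
    """Hypothetical annotation record; coordinates are 0-based, end-exclusive."""
    start: int
    end: int


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def intergenic_regions(genome: str, genes: List[Gene]) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) gaps not covered by any annotated gene."""
    prev_end = 0
    for gene in sorted(genes, key=lambda g: g.start):
        if gene.start > prev_end:
            yield prev_end, gene.start
        # max() handles overlapping or nested gene annotations
        prev_end = max(prev_end, gene.end)
    if prev_end < len(genome):
        yield prev_end, len(genome)


def candidate_igrs(
    genome: str,
    genes: List[Gene],
    min_len: int = 100,   # assumed length cutoff, not taken from the paper
    min_gc: float = 0.35,  # assumed GC cutoff, not taken from the paper
) -> List[Tuple[int, int, float]]:
    """Return (start, end, gc) for IGRs passing both cutoffs."""
    hits = []
    for start, end in intergenic_regions(genome, genes):
        gc = gc_content(genome[start:end])
        if end - start >= min_len and gc >= min_gc:
            hits.append((start, end, round(gc, 3)))
    return hits
```

In practice the cutoffs would presumably be set relative to the genome-wide distribution of IGR lengths and GC content, and regions passing the filter would only be candidates for downstream comparative analysis, not confirmed structured RNAs.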
