A Search for H/ACA snoRNAs in Yeast Using MFE Secondary Structure Prediction

MOTIVATION: Noncoding RNA genes produce functional RNA molecules rather than coding for proteins. One such family is the H/ACA snoRNAs. Unlike the related C/D snoRNAs, these have so far resisted automated detection.

RESULTS: We develop an algorithm to screen the yeast genome for novel H/ACA snoRNAs. To achieve this, we introduce new methods for finding noncoding RNAs in genomic sequences, based on properties of predicted minimum free-energy (MFE) secondary structures. The algorithm has been implemented and can be generalized to screen other eukaryote genomes. We find that primary sequence alone is insufficient for identifying novel H/ACA snoRNAs: only the addition of secondary structure filters reduces the candidates to a manageable number. Using genomic context, we identify three strong H/ACA snoRNA candidates. These, together with a further 47 candidates obtained by our analysis, are being experimentally screened.
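To illustrate what a primary-sequence-only screen looks like, and why it tends to leave many candidates for a structure filter to remove, here is a minimal Python sketch. It is not the algorithm described above. It assumes the commonly cited H box consensus ANANNA followed downstream by an ACA box. The function name `find_candidates` and the spacing limits `min_gap` and `max_gap` are hypothetical, chosen only for illustration.

```python
import re

# Assumed H box consensus ANANNA (N = any nucleotide).
# The lookahead lets overlapping matches be reported.
H_BOX = re.compile(r"(?=(A[ACGT]A[ACGT]{2}A))")


def find_candidates(seq, min_gap=40, max_gap=90):
    """Return (h_start, aca_start) pairs for an H box followed by an
    ACA triplet between min_gap and max_gap nucleotides downstream.

    The gap limits are illustrative assumptions, not published values.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    hits = []
    for match in H_BOX.finditer(seq):
        h_start = match.start()
        h_end = h_start + 6  # the H box is six nucleotides long
        first = h_end + min_gap
        last = min(h_end + max_gap, len(seq) - 3)
        for aca_start in range(first, last + 1):
            if seq[aca_start:aca_start + 3] == "ACA":
                hits.append((h_start, aca_start))
    return hits


# Toy sequence: one H box, a 50-nt spacer, then an ACA box.
toy = "GG" + "AGATTA" + "C" * 50 + "ACA" + "GGG"
candidates = find_candidates(toy)
```

Because both motifs are short and degenerate, a scan like this matches frequently in any genome-sized sequence, which is consistent with the observation above that sequence alone is insufficient. A structure-based filter (for example, on predicted MFE hairpins) would be applied to the returned pairs as a separate step.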
