Profile-based detection of microRNA precursors in animal genomes

MOTIVATION: MicroRNAs (miRNAs) are essential 21-22 nt regulatory RNAs produced from larger hairpin-like precursors. Local sequence alignment tools such as BLAST can identify new members of known miRNA families, but not all of them. We set out to estimate how many new miRNAs could be recovered using a profile-based strategy such as that implemented in the ERPIN program.

RESULTS: We constructed alignments for 18 miRNA families and performed ERPIN searches on animal genomes. Results were compared to those of a WU-BLAST search at the same E-value cutoff. Combined, the two approaches produced 265 new miRNA candidates that were not found in miRNA databases. About 17% of hits were ERPIN-specific. These showed better structural characteristics than BLAST-specific hits and included interesting candidates such as members of the miR-17 cluster in Tetraodon. Profile-based RNA detection will be an important complement to similarity search programs in the completion of miRNA collections.
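To illustrate what "profile-based" means in contrast to pairwise alignment, the following is a minimal sketch of position-specific log-odds scoring built from a multiple alignment. It is not ERPIN's implementation: ERPIN also scores base-paired helix positions and handles gaps, which this sketch omits. The toy alignment, the uniform background, the pseudocount, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

ALPHABET = "ACGU"


def build_profile(alignment, background=None, pseudocount=1.0):
    """Build per-column log-odds scores from an ungapped alignment.

    Assumption: uniform background and a simple additive pseudocount;
    a real tool may weight sequences and model gaps and base pairs.
    """
    if background is None:
        background = {base: 0.25 for base in ALPHABET}
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    total = len(alignment) + pseudocount * len(ALPHABET)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        profile.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background[base])
            for base in ALPHABET
        })
    return profile


def score_window(profile, window):
    """Sum column scores; symbols outside ACGU contribute 0 (treated as neutral)."""
    return sum(col.get(base, 0.0) for col, base in zip(profile, window))


def scan(profile, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(profile)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(profile, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from any miRNA family.
    toy_alignment = ["ACGU", "ACGU", "ACGA"]
    profile = build_profile(toy_alignment)
    for start, score in scan(profile, "GGACGUGG", threshold=4.0):
        print(start, round(score, 2))
```

Because each column contributes its own score, a candidate that differs from every individual family member can still score well if each of its bases is common at that position across the family. This is one way a profile search can recover hits that a pairwise similarity search misses.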
