Design of an RNA structural motif database

In this paper we present the design and implementation of RmotifDB, an RNA structural motif database. The structural motifs stored in RmotifDB come from three sources: motifs collected manually from the biomedical literature, motifs submitted by scientists around the world, and motifs discovered by a wide variety of motif mining methods. We describe one such motif mining method in detail. We also describe the interface and search mechanisms provided by RmotifDB and report its current status. The RmotifDB system is fully operational and accessible on the web at http://datalab.njit.edu/bioinfo/.
