Toward an Integrated RNA Motif Database

In this paper, we present the design and implementation of RmotifDB, an RNA structural motif database. The structural motifs stored in RmotifDB come from three sources: (1) manual collection from the biomedical literature; (2) submissions by scientists around the world; and (3) discovery by a variety of motif mining methods. We describe one such motif mining method in detail. We also describe the interface and search mechanisms provided by RmotifDB, as well as the techniques used to integrate RmotifDB with the Gene Ontology. The RmotifDB system is fully operational and accessible on the Internet at http://datalab.njit.edu/bioinfo/
