Genome-wide search for coaxial helical stacking motifs

Motif finding in DNA, RNA and proteins plays an important role in life science research. In this paper, we present a computational approach to searching for RNA tertiary motifs in genomic sequences. Specifically, we describe a method named CSminer and, as a case study, apply it to a genome-wide search for coaxial helical stacking motifs in RNA 3-way junctions. A coaxial helical stacking motif occurs in an RNA 3-way junction where two separate helical elements stack to form a pseudocontiguous helix, which provides thermodynamic stability to the RNA molecule as a whole. Experimental results demonstrate the effectiveness of our approach.
