P-DCFold or How to Predict all Kinds of Pseudoknots in RNA Secondary Structures

Pseudoknots play important roles in many RNAs. For computational reasons, however, they are usually excluded from the definition of RNA secondary structure. Indeed, predicting pseudoknots greatly increases the time complexity of the algorithms, given that all existing algorithms for RNA secondary structure prediction already have a complexity of at least O(n^3). Some algorithms have been developed to search for pseudoknots, but all of them have very high complexity and generally consider only particular kinds of pseudoknots. We present an algorithm, called P-DCFold, based on the comparative approach, for predicting RNA secondary structures including all kinds of pseudoknots. Helices are searched recursively using a "divide and conquer" approach, from the "most significant" to the "least significant". A selected helix subdivides the sequence into two subsequences: the internal one and the concatenation of the two external ones. This approach is used to search for non-interleaved helices and limits the search space. To search for pseudoknots, the process is reiterated, so that each helix of a pseudoknot is selected in a different step. P-DCFold has been applied to several RNA sequences. In less than two seconds, their respective secondary structures, including their pseudoknots, were recovered very efficiently.
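
The recursive subdivision described above can be illustrated with a minimal Python sketch. It is not the P-DCFold implementation: the abstract does not say how helix significance is computed from the comparative data, so this sketch assumes a stand-in criterion (the longest run of stacked Watson-Crick pairs in a single sequence). The function names (`find_helix`, `fold`, `predict`, `crosses`), the parameters `min_len` and `min_loop`, and the reading of "the processing is reiterated" as "rerun the search on the positions still unpaired" are all assumptions made for illustration.

```python
# Toy divide-and-conquer helix search (illustrative only, not P-DCFold itself).
# Assumption: "most significant" helix = longest stack of Watson-Crick pairs.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def find_helix(seq, pos, min_len=3, min_loop=3):
    """Return (a, b, k): the longest helix whose outermost pair is
    (pos[a], pos[b]) and which has k stacked pairs, or None.
    `pos` is a list of original sequence indices still available."""
    best = None
    n = len(pos)
    for a in range(n):
        for b in range(a + 1, n):
            k = 0
            # Extend inward while the bases pair and the two strands stay
            # more than `min_loop` positions apart in the original sequence.
            while (a + k < b - k
                   and pos[b - k] - pos[a + k] > min_loop
                   and (seq[pos[a + k]], seq[pos[b - k]]) in WC_PAIRS):
                k += 1
            if k >= min_len and (best is None or k > best[2]):
                best = (a, b, k)
    return best


def fold(seq, pos, helices, min_len=3):
    """Select one helix, then recurse on the internal subsequence and on
    the concatenation of the two external subsequences."""
    hit = find_helix(seq, pos, min_len)
    if hit is None:
        return
    a, b, k = hit
    helices.append([(pos[a + i], pos[b - i]) for i in range(k)])
    internal = pos[a + k: b - k + 1]
    external = pos[:a] + pos[b + 1:]
    fold(seq, internal, helices, min_len)
    fold(seq, external, helices, min_len)


def predict(seq, min_len=3):
    """Run the recursive search, then reiterate it on the unpaired
    positions so that helices crossing earlier ones can be selected."""
    helices = []
    pos = list(range(len(seq)))
    while True:
        before = len(helices)
        fold(seq, pos, helices, min_len)
        if len(helices) == before:
            break  # the last pass found nothing new
        paired = {p for helix in helices for pair in helix for p in pair}
        pos = [p for p in range(len(seq)) if p not in paired]
    return helices


def crosses(h1, h2):
    """True if the outermost pairs of two helices interleave (i < k < j < l)."""
    (i, j), (k, l) = sorted([h1[0], h2[0]])
    return i < k < j < l
```

On the artificial sequence `GGGGUUUUCCCCAAAA`, the first pass of this sketch selects the G-C helix; the U and A stretches then fall into different subsequences (internal and external), so they cannot pair in that pass. The reiterated pass, which sees all unpaired positions together, selects the U-A helix that crosses the first one, so each helix of the pseudoknot is found in a different step.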
