Pseudoknots in RNA Secondary Structures: Representation, Enumeration, and Prevalence

A number of non-coding RNAs are known to contain functionally important or conserved pseudoknots. However, pseudoknotted structures are more complex than orthodox (pseudoknot-free) ones, and most methods for analyzing secondary structures do not handle them. I present here a way to decompose and represent general secondary structures that extends the tree representation of the stem-loop structure, and I use it to analyze the frequency of pseudoknots in known and in random secondary structures. This comparison shows that, although a number of pseudoknots exist, they are still relatively rare and mostly of the simpler kinds. In contrast, random secondary structures tend to be heavily knotted, and the number of available structures increases dramatically when pseudoknots are allowed. Therefore, methods for structure prediction and non-coding RNA identification that allow pseudoknots are likely to be much less powerful than those that do not, unless they penalize pseudoknots appropriately.
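As a rough illustration of the two ideas above, the following Python sketch tests a structure for pseudoknots and compares the number of structures with and without them. It is not the decomposition described in the paper. It assumes a structure is given as a list of base pairs (i, j) over sequence positions, and it uses the standard definition that two pairs (i, j) and (k, l) form a pseudoknot when i < k < j < l. The counting functions are simplified further: they assume any two positions may pair, ignoring base complementarity and minimum hairpin loop length, so the numbers show only the trend. All function names are hypothetical.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return all pairs of base pairs (i, j), (k, l) with i < k < j < l.

    An empty result means the structure is pseudoknot-free under the
    crossing definition assumed here.
    """
    # Normalize each pair to (smaller, larger) and sort by the first position.
    normalized = sorted(tuple(sorted(p)) for p in pairs)
    crossings = []
    for (i, j), (k, l) in combinations(normalized, 2):
        # After sorting, i < k holds whenever positions are distinct.
        if i < k < j < l:
            crossings.append(((i, j), (k, l)))
    return crossings


def count_noncrossing(n):
    """Count pseudoknot-free pairings of n positions (Motzkin numbers).

    Simplifying assumption: any two positions may pair.
    """
    m = [1] * (n + 1)
    for length in range(2, n + 1):
        # The last position is unpaired, or pairs with a position that
        # splits the remainder into an outside part and an enclosed part.
        total = m[length - 1]
        for k in range(length - 1):
            total += m[k] * m[length - 2 - k]
        m[length] = total
    return m[n]


def count_all(n):
    """Count all pairings of n positions, crossings allowed.

    Same simplifying assumption: any two positions may pair.
    """
    t = [1] * (n + 1)
    for length in range(2, n + 1):
        # The last position is unpaired, or pairs with any of the others.
        t[length] = t[length - 1] + (length - 1) * t[length - 2]
    return t[n]


if __name__ == "__main__":
    nested = [(1, 10), (2, 9), (4, 7)]
    knotted = [(1, 5), (3, 8)]
    print(crossing_pairs(nested))
    print(crossing_pairs(knotted))
    for n in (5, 10, 20):
        print(n, count_noncrossing(n), count_all(n))
```

Even in this simplified setting the gap between the two counts widens quickly with sequence length, which is the effect the abstract points to when it warns that allowing pseudoknots without a penalty enlarges the search space.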
