Asymptotics of Canonical RNA Secondary Structures

It is a classical result of Stein and Waterman that the asymptotic number S(n) of RNA secondary structures is 1.104366 · n^(-3/2) · 2.61803^n, where the combinatorial model of RNA concerns a length n homopolymer, such that any base can pair with any other base, subject to the usual convention that hairpin loops must contain at least θ = 1 unpaired bases. The result of Stein and Waterman is proved by developing recursions, using generating functions, and applying Bender's theorem. These recursions form the basis for computing the minimum free energy secondary structure of a given RNA sequence with respect to the Nussinov energy model, later extended by Zuker to substantially more complicated recursions for the Turner nearest neighbor energy model. In this paper, we study combinatorial asymptotics for two special subclasses of RNA secondary structures: canonical and saturated structures. Canonical secondary structures are defined to have no lonely (isolated) base pairs. This class of secondary structures was introduced by Bompfuenewerer et al., who noted that the runtime of the Vienna RNA Package is substantially decreased when computations are restricted to canonical structures. Here we provide an explanation for the speed-up by proving that the asymptotic number of canonical RNA secondary structures is 2.1614 · n^(-3/2) · 1.96798^n, a result obtained using different methods by Hofacker et al. Saturated secondary structures have the property that no base pair can be added without violating the definition of secondary structure (i.e., without introducing a pseudoknot or base triple). In the Nussinov energy model, where the energy of a base pair is -1, saturated structures correspond to kinetic traps. In prior work, we showed that the asymptotic number of saturated structures of a length n homopolymer is 1.07427 · n^(-3/2) · 2.35467^n. Here we determine the asymptotic expected number of base pairs in (quasi-)random saturated structures.
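As an illustration of the kind of recursion mentioned above, the following is a minimal sketch (not taken from the paper) that counts secondary structures of a length n homopolymer by conditioning on whether the last base is unpaired or paired with some position k. The function name `count_structures` and the parameter `theta` (minimum number of unpaired bases in a hairpin loop) are assumptions introduced here for illustration. The ratio of successive counts should slowly approach the growth constant 2.61803 quoted above.

```python
def count_structures(n, theta=1):
    """Return the list [S(0), ..., S(n)] of secondary structure counts.

    Recursion (a standard form, assumed here for illustration):
      S(m) = S(m-1)                                  # base m is unpaired
           + sum_{k=1}^{m-theta-1} S(k-1) * S(m-k-1)  # base m pairs with k
    with S(m) = 1 whenever m <= theta + 1 (only the empty structure fits).
    """
    s = [1] * (n + 1)
    for m in range(theta + 2, n + 1):
        total = s[m - 1]
        # k ranges over partners of base m leaving at least theta bases inside
        for k in range(1, m - theta):
            total += s[k - 1] * s[m - k - 1]
        s[m] = total
    return s


if __name__ == "__main__":
    counts = count_structures(400)
    # First values for theta = 1: 1, 1, 1, 2, 4, 8, 17, 37, 82
    print(counts[:9])
    # Ratio of successive terms; tends to roughly 2.61803 as n grows
    print(counts[400] / counts[399])
```

The sketch uses exact integer arithmetic, so the counts themselves are exact; only the final ratio is a floating-point approximation, and its convergence to the growth constant is slow because of the n^(-3/2) factor.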

[1] J. Mackenzie. Sequential Filling of a Line by Intervals Placed at Random and Its Application to Linear Adsorption, 1962.

[2] Henry D. Friedman. An Unfriendly Seating Arrangement (Dave Freedman), 1964.

[3] J. Kingman. Subadditive Ergodic Theory, 1973.

[4] E. Bender. Asymptotic Methods in Enumeration, 1974.

[5] Michael S. Waterman, et al. On some new sequences generalizing the Catalan and Motzkin numbers, 1979, Discret. Math.

[6] R. Nussinov, et al. Fast algorithm for predicting the secondary structure of single-stranded RNA, 1980, Proceedings of the National Academy of Sciences of the United States of America.

[7] Michael Zuker, et al. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information, 1981, Nucleic Acids Res.

[8] Philippe Flajolet, et al. Singularity Analysis of Generating Functions, 1990, SIAM J. Discret. Math.

[9] Christos H. Papadimitriou, et al. Elements of the Theory of Computation, 1997, SIGA.

[10] J. Sabina, et al. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure, 1999, Journal of molecular biology.

[11] A. Denise, et al. Random generation of words of context-free languages according to the frequencies of letters, 2000.

[12] Michael Zuker, et al. Mfold web server for nucleic acid folding and hybridization prediction, 2003, Nucleic Acids Res.

[13] Sean R. Eddy, et al. Rfam: an RNA family database, 2003, Nucleic Acids Res.

[14] Ivo L. Hofacker, et al. Vienna RNA secondary structure server, 2003, Nucleic Acids Res.

[15] P. Clote, et al. Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency, 2005, RNA.

[16] K.C. Wiese, et al. jViz.Rna - a java tool for RNA secondary structure visualization, 2005, IEEE Transactions on NanoBioscience.

[17] Peter Clote, et al. Combinatorics of Saturated Secondary Structures of RNA, 2006, J. Comput. Biol.

[18] Peter Clote, et al. Asymptotics of RNA Shapes, 2008, J. Comput. Biol.

[19] Rolf Backofen, et al. Variations on RNA folding and alignment: lessons from Benasque, 2007, Journal of mathematical biology.