Dynamic Programming Algorithm for the Density of States of RNA Secondary Structures

Jan Cupal, Ivo L. Hofacker+ and Peter F. Stadler†

Theoretische Biochemie @ Institut für Theoretische Chemie, Universität Wien, Währingerstraße 17, A-1090 Wien, Austria
+Beckman Institute, University of Illinois, Urbana, IL 61801, U.S.A.
†The Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, U.S.A.

A dynamic programming algorithm for the computation of the complete density of states of RNA secondary structures is presented. CPU and memory requirements scale as $n^3 m^2$ and $n^2 m$, respectively, where $n$ is the chain length and $m$ is the number of energy bins.

The Vienna RNA Package [4, 5] is an efficient implementation developed for the computation of the minimum free energy structure [8], sets of suboptimal structures [7], and the complete matrix of base pairing probabilities [6]. All these algorithms are based on a common dynamic programming scheme.

The density of states (d.o.s.), i.e., the energy distribution of suboptimal secondary structures, is of utmost importance for an understanding of the structural versatility of RNA molecules [3]. Higgs found that the d.o.s. of evolved sequences such as tRNAs differ significantly from those of random RNA sequences. His studies were based, however, on a non-recursive algorithm using a drastically simplified energy model for the RNA secondary structures [2].

We show here that the same dynamic programming scheme that underlies all folding algorithms can be extended to a rigorous computation of the complete density of states. The key observation is that the d.o.s. of a subsequence [i, j] can be computed recursively from the d.o.s. of all shorter subsequences contained in [i, j]; see the box below for details. The algorithm is based on the standard energy model for RNA secondary structures, see e.g. [1]. All code is written in ANSI C.
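
The recursive idea can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the standard energy model. It assumes a toy model in which every base pair contributes one energy unit, so that energy bin k simply holds the number of structures with k base pairs. The names density_of_states, can_pair, MAXN, MAXBINS and MIN_LOOP are hypothetical, as is the choice of a minimum hairpin loop of three unpaired bases. The d.o.s. of [i, j] is assembled from the cases "j is unpaired" and "j pairs with l", where the second case convolves the energy bins of the two shorter subsequences [i, l-1] and [l+1, j-1]. This convolution is where the $n^3 m^2$ time and $n^2 m$ memory scaling comes from.

```c
#include <assert.h>
#include <string.h>

#define MAXN     32              /* hypothetical upper limit on chain length n */
#define MAXBINS  (MAXN / 2 + 1)  /* m: n bases form at most n/2 base pairs */
#define MIN_LOOP 3               /* assumed minimum number of unpaired hairpin bases */

/* dos[i][j][k]: number of structures on subsequence [i, j] with exactly k
   base pairs, i.e. energy -k in this toy model (an assumption, not the
   standard energy model). Positions are 1-based; [i, i-1] is empty. */
static double dos[MAXN + 2][MAXN + 2][MAXBINS];

static int can_pair(char a, char b)
{
    return (a == 'A' && b == 'U') || (a == 'U' && b == 'A') ||
           (a == 'G' && b == 'C') || (a == 'C' && b == 'G') ||
           (a == 'G' && b == 'U') || (a == 'U' && b == 'G');
}

/* Writes the density of states of the full sequence into out[0..MAXBINS-1]. */
void density_of_states(const char *seq, double out[MAXBINS])
{
    int n = (int) strlen(seq);
    int len, i, j, l, k, k1, k2;

    assert(n >= 1 && n <= MAXN);

    for (i = 0; i <= n + 1; i++)
        for (j = 0; j <= n + 1; j++)
            for (k = 0; k < MAXBINS; k++)
                dos[i][j][k] = 0.0;

    /* The empty interval [i, i-1] has exactly one (empty) structure. */
    for (i = 1; i <= n + 1; i++)
        dos[i][i - 1][0] = 1.0;

    /* Process subsequences in order of increasing length. */
    for (len = 1; len <= n; len++) {
        for (i = 1; i + len - 1 <= n; i++) {
            j = i + len - 1;

            /* Case 1: base j is unpaired. */
            for (k = 0; k < MAXBINS; k++)
                dos[i][j][k] = dos[i][j - 1][k];

            /* Case 2: base j pairs with base l. Convolve the bins of the
               left part [i, l-1] with those of the enclosed part [l+1, j-1]. */
            for (l = i; l < j - MIN_LOOP; l++) {
                if (!can_pair(seq[l - 1], seq[j - 1]))
                    continue;
                for (k1 = 0; k1 < MAXBINS - 1; k1++) {
                    if (dos[i][l - 1][k1] == 0.0)
                        continue;
                    for (k2 = 0; k1 + k2 + 1 < MAXBINS; k2++)
                        dos[i][j][k1 + k2 + 1] +=
                            dos[i][l - 1][k1] * dos[l + 1][j - 1][k2];
                }
            }
        }
    }

    for (k = 0; k < MAXBINS; k++)
        out[k] = dos[1][n][k];
}
```

For the sequence GGAAACC, this sketch yields one open structure, four structures with a single base pair, and one structure with two nested base pairs. Counts are stored as double so that the code stays within ANSI C; for the chain lengths allowed here they remain exactly representable.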