Integrating Chemical Footprinting Data into RNA Secondary Structure Prediction

Chemical and enzymatic footprinting experiments, such as SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension), yield important information about RNA secondary structure. Because the 2′-hydroxyl is reactive in flexible (loop) regions but unreactive in base-paired regions, SHAPE yields quantitative data about which RNA nucleotides are base-paired. Recently, low error rates in secondary structure prediction were reported for three RNAs of moderate size, obtained by incorporating base stacking pseudo-energy terms derived from SHAPE data into the computation of the minimum free energy secondary structure. Here, we describe a novel method, RNAsc (RNA soft constraints), which includes pseudo-energy terms for each nucleotide position, rather than only for base stacking positions. We prove that RNAsc is self-consistent, in the sense that the nucleotide-specific probabilities of being unpaired in the low energy Boltzmann ensemble always become more closely correlated with the input SHAPE data after application of RNAsc. From this mathematical perspective, the secondary structure predicted by RNAsc should be ‘correct’ inasmuch as the SHAPE data are ‘correct’. We benchmark RNAsc against the previously mentioned method on eight RNAs for which both SHAPE data and native structures are known, and find the same accuracy in seven of the eight cases and an improvement of 25% in the remaining case. Furthermore, we present what appears to be the first direct comparison of SHAPE data and in-line probing data, by comparing yeast Asp-tRNA SHAPE data from the literature with data from in-line probing experiments that we have recently performed. With respect to several criteria, we find that SHAPE data appear to be more robust than in-line probing data, at least in the case of Asp-tRNA.
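
The self-consistency property described above can be illustrated with a minimal Python sketch. It is not the RNAsc algorithm: the three candidate structures, their free energies, the reactivity values, and the linear penalty in `pseudo_energy` are all hypothetical choices made for illustration. The sketch only shows how a per-nucleotide pseudo-energy reweights a Boltzmann ensemble, and how the correlation between unpaired probabilities and probing data can be checked before and after.

```python
import math

RT = 0.616  # kcal/mol, approximately R*T at 37 °C


def unpaired_probabilities(structures, energies):
    """Boltzmann-weighted probability that each position is unpaired."""
    weights = [math.exp(-e / RT) for e in energies]
    z = sum(weights)  # partition function over the toy ensemble
    n = len(structures[0])
    return [
        sum(w for s, w in zip(structures, weights) if s[i] == ".") / z
        for i in range(n)
    ]


def pseudo_energy(structure, reactivity, scale=1.0):
    """Hypothetical per-nucleotide penalty (an assumption, not RNAsc's formula):
    a paired position costs r, an unpaired position costs (1 - r)."""
    return scale * sum(
        (1.0 - r) if ch == "." else r
        for ch, r in zip(structure, reactivity)
    )


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Toy ensemble in dot-bracket notation with made-up free energies (kcal/mol).
structures = ["((....))..", "..((...)).", ".........."]
energies = [-2.0, -2.5, 0.0]

# Made-up reactivities normalized to [0, 1]; high values suggest flexibility.
reactivity = [0.1, 0.1, 0.9, 0.8, 0.9, 0.9, 0.1, 0.2, 0.8, 0.9]

p_before = unpaired_probabilities(structures, energies)
constrained = [e + pseudo_energy(s, reactivity) for s, e in zip(structures, energies)]
p_after = unpaired_probabilities(structures, constrained)

r_before = pearson(p_before, reactivity)
r_after = pearson(p_after, reactivity)
print(f"correlation before: {r_before:.3f}, after: {r_after:.3f}")
```

In this toy ensemble the penalty shifts Boltzmann weight toward the structure that agrees with the reactivities, so the correlation rises. A real implementation would compute unpaired probabilities over all secondary structures with a partition function algorithm rather than over an enumerated list.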
