Advances in Bioinformatics and Computational Biology

RNA structure prediction has long struggled with long-range base pairs, since prediction accuracy decreases with base pair span. Here we analyze the empirical distribution of base pair spans in a large collection of experimentally known RNA structures. Surprisingly, we find that long-range base pairs are overrepresented in these data. In particular, there is no evidence that long-range base pairs are systematically overpredicted relative to short-range interactions in thermodynamic predictions. This casts doubt on a recent suggestion that kinetic effects cause the length-dependent decrease in predictability. Instead of modifying the energy model, we advocate modifying the expected accuracy model for RNA secondary structures. We demonstrate that including a span-dependent penalty improves maximum expected accuracy (MEA) structure predictions compared to both the standard MEA model and a modified folding algorithm with an energy penalty function. The prevalence of long-range base pairs provides further evidence that RNA structures in general do not have the so-called polymer zeta property. This has consequences for the asymptotic performance of a large class of sparsified RNA folding algorithms.
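To make the notion of base pair span concrete, the following minimal sketch tabulates spans from secondary structures written in dot-bracket notation. It is an illustration only, not the procedure used in the paper: the input format, the definition of span as j - i for a pair (i, j), and the function names `base_pair_spans` and `span_histogram` are assumptions, and pseudoknots are not handled.

```python
from collections import Counter


def base_pair_spans(dot_bracket):
    """Return the span j - i of every base pair (i, j) in a dot-bracket string.

    Assumption: span is measured as the index difference j - i, and the
    structure is pseudoknot-free (only '(' , ')' and unpaired symbols).
    """
    stack = []   # positions of opening brackets not yet closed
    spans = []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            i = stack.pop()
            spans.append(j - i)
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return spans


def span_histogram(structures):
    """Count how often each span occurs across a collection of structures."""
    counts = Counter()
    for structure in structures:
        counts.update(base_pair_spans(structure))
    return counts


if __name__ == "__main__":
    example = ["((..))", "(...)"]
    for span, count in sorted(span_histogram(example).items()):
        print(span, count)
```

Comparing such a histogram for experimentally known structures with the one obtained from predicted structures of the same sequences is one way to check whether long-range pairs are over- or underpredicted.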
