The energy-spectrum of bicompatible sequences

Background: Genotype-phenotype maps provide a meaningful filtration of sequence space, and RNA secondary structures are one such class of phenotypes. Compatible sequences, which satisfy the base-pairing constraints of a given RNA structure, play an important role in neutral evolution. Sequences that are simultaneously compatible with two given structures (bicompatible sequences) are beacons in phenotypic transitions induced by erroneously replicating populations of RNA sequences. RNA riboswitches, which can express two distinct secondary structures without any change to the underlying sequence, are one example of bicompatible sequences in living organisms.

Results: We present a Boltzmann sampler of bicompatible sequences for pairs of structures under the full loop energy model. The sampler employs a dynamic programming routine whose time complexity is polynomial provided that the maximum number of exposed vertices, $\kappa$, is constant. The parameter $\kappa$ depends on the two structures and can be very large. We introduce a novel topological framework that encapsulates the relations between loops and sheds light on $\kappa$. Based on this framework, we give an algorithm that samples sequences with minimum $\kappa$ in one topologically classified case, and we give hints toward a solution in the other cases. We then use the sampler to study several established riboswitches.

Conclusion: Our analysis of riboswitch sequences shows that a pair of structures needs to satisfy key properties in order to facilitate phenotypic transitions, and that pairs of random structures are unlikely to do so. The analysis also reveals a distinct signature of riboswitch sequences, suggesting a new criterion for identifying native sequences and sequences subjected to evolutionary pressure. Our free software is available at: https://github.com/FenixHuang667/Bifold .
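To make the notion of bicompatibility concrete, the following minimal Python sketch checks whether a sequence satisfies the base-pairing constraints of two structures at once. It is an illustration only, not the sampler described above: it assumes pseudoknot-free structures in dot-bracket notation and the six canonical pairs (Watson-Crick plus G-U wobble), and the function names `parse_pairs`, `is_compatible` and `is_bicompatible` are hypothetical, not taken from the Bifold software.

```python
# Allowed base pairs: Watson-Crick plus G-U wobble. This pairing rule is an
# assumption made for illustration; the abstract does not state it.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def parse_pairs(structure):
    """Return the base pairs (i, j), i < j, of a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def is_compatible(sequence, structure):
    """True if every paired position of the structure holds an allowed pair."""
    if len(sequence) != len(structure):
        return False
    return all(
        (sequence[i], sequence[j]) in ALLOWED_PAIRS
        for i, j in parse_pairs(structure)
    )


def is_bicompatible(sequence, structure_s, structure_t):
    """True if the sequence is compatible with both structures (hypothetical helper)."""
    return is_compatible(sequence, structure_s) and is_compatible(sequence, structure_t)


if __name__ == "__main__":
    s = "((..)).."
    t = "..((..))"
    print(is_bicompatible("GGGGCCCC", s, t))
    print(is_bicompatible("GGAACCCC", s, t))
```

The check only tests the pairing constraints; it says nothing about the loop energies of the sequence on either structure, which is what the Boltzmann sampler weights.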
