Analysis of RNA sequence structure maps by exhaustive enumeration I. Neutral networks

Summary

Global relations between RNA sequences and secondary structures are understood as mappings from sequence space into shape space. These mappings are investigated by exhaustive folding of all GC and AU sequences with chain lengths up to 30. The computed structural data are evaluated through exhaustive enumeration and used as an exact reference for testing analytical results derived from mathematical models and sampling based on statistical methods. Several new concepts of RNA sequence to secondary structure mappings are investigated, among them that of neutral networks (sets of sequences folding into the same structure). Exhaustive enumeration makes it possible to test several previously suggested relations: the number of (minimum free energy) secondary structures as a function of the chain length, as well as the frequency distribution of structures at constant chain length (commonly resulting in generalized forms of Zipf's law).

Zusammenfassung (translated from German)

The global relations between RNA sequences and secondary structures are regarded as mappings from a space of all sequences into a space of all structures. These mappings are investigated by folding all binary sequences of the GC and AU alphabets with chain lengths up to n = 30. The computed structural data are evaluated by complete enumeration and used as an exact reference for checking analytical results from mathematical models and for testing statistically collected samples. Several novel concepts for describing the relations between sequences and structures are examined in detail, among them the notion of neutral networks. A neutral network consists of all sequences that form a particular structure. Complete enumeration makes it possible, for example, to determine all minimum free energy structures as a function of chain length, as well as the frequency distributions of structures at constant chain lengths. The latter follow a generalized form of Zipf's law.
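The enumeration idea described above can be illustrated with a minimal sketch. It assumes a toy folding rule that maximizes the number of G-C base pairs (a Nussinov-style dynamic program) as a stand-in for the minimum free energy folding used in the study; the names `fold`, `neutral_networks`, and `MIN_LOOP` are hypothetical and introduced only for illustration. The sketch folds every GC sequence of a short chain length, groups sequences by structure into neutral networks, and ranks the networks by size, which is the rank-ordered frequency distribution one would compare against a generalized Zipf's law.

```python
from collections import Counter
from itertools import product

MIN_LOOP = 3  # assumption: a hairpin loop needs at least 3 unpaired bases


def fold(seq, pairs=frozenset({("G", "C"), ("C", "G")})):
    """Toy fold: maximize the number of base pairs (NOT a free energy model)."""
    n = len(seq)
    # best[i][j] = maximum number of pairs within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case: position i stays unpaired
            for k in range(i + MIN_LOOP + 1, j + 1):
                if (seq[i], seq[k]) in pairs:
                    right = best[k + 1][j] if k < j else 0
                    score = max(score, 1 + best[i + 1][k - 1] + right)
            best[i][j] = score

    # Deterministic traceback to one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue  # interval too short to contain a pair
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + MIN_LOOP + 1, j + 1):
            if (seq[i], seq[k]) in pairs:
                right = best[k + 1][j] if k < j else 0
                if best[i][j] == 1 + best[i + 1][k - 1] + right:
                    structure[i], structure[k] = "(", ")"
                    stack.append((i + 1, k - 1))
                    if k < j:
                        stack.append((k + 1, j))
                    break
    return "".join(structure)


def neutral_networks(n, alphabet="GC"):
    """Fold all len(alphabet)**n sequences; return (structure, size) by rank."""
    sizes = Counter(fold("".join(s)) for s in product(alphabet, repeat=n))
    return sizes.most_common()  # largest neutral network first


if __name__ == "__main__":
    ranked = neutral_networks(10)
    print(len(ranked), "distinct structures for n = 10")
    for rank, (structure, size) in enumerate(ranked[:5], start=1):
        print(rank, structure, size)
```

Because the number of sequences grows as 2^n for a two-letter alphabet, this pure-Python sketch is only practical for short chains; reaching chain lengths near 30 with a thermodynamic energy model would need a dedicated folding implementation.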
