RNA canonical and non-canonical base pairing types: a recognition method and complete repertoire.

The problem of systematic and objective identification of canonical and non-canonical base pairs in RNA three-dimensional (3D) structures was studied. A probabilistic approach was applied, and an algorithm, implemented in a computer program, was developed to detect and analyze all the base pairs contained in RNA 3D structures. The algorithm objectively distinguishes among canonical and non-canonical base pairing types formed by three, two, or one hydrogen bonds (H-bonds), as well as those containing bifurcated and C-H...X H-bonds. The donor and acceptor atoms of a 3D structure are encoded as the nodes of a bipartite graph. The capacity of each edge corresponds to the probability, computed from the geometry of the donor and acceptor groups, that the two groups form an H-bond. The maximum flow from donors to acceptors directly identifies base pairs and their types. A complete repertoire of base pairing types was built from the H-bonds detected in all X-ray crystal structures with a resolution of 3.0 Å or better, including the large and small ribosomal subunits. The base pairing types are labeled using an extension of the nomenclature recently introduced by Leontis and Westhof. The probabilistic method was implemented in MC-Annotate, an RNA structure analysis computer program used to determine the base pairing parameters of the 3D modeling system MC-Sym.
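The donor-acceptor flow formulation described above can be illustrated with a minimal sketch. It is not the MC-Annotate implementation: the atom labels, the H-bond probabilities, the unit capacities on the source and sink edges, the 0.5 reporting threshold, and the function name `max_flow` are all assumptions chosen for illustration, and the plain Edmonds-Karp routine stands in for whichever maximum-flow algorithm the program actually uses.

```python
from collections import deque

def max_flow(capacity, source, sink, eps=1e-12):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity map.

    Returns (flow value, {(u, v): flow}) for the original edges.
    """
    # Residual graph: forward edges plus zero-capacity reverse edges.
    residual = {}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})[v] = residual.get(u, {}).get(v, 0.0) + c
            residual.setdefault(v, {}).setdefault(u, 0.0)

    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual.get(u, {}).items():
                if c > eps and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        # Walk back from the sink to collect the path edges.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= push
            residual[b][a] += push
        total += push

    flow = {}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            used = c - residual[u][v]
            if used > eps:
                flow[(u, v)] = used
    return total, flow

# Hypothetical H-bond probabilities for a G-C pair (illustrative values only).
h_bond_prob = {
    ("G:N1-H", "C:N3"): 0.95,
    ("G:N2-H", "C:O2"): 0.90,
    ("C:N4-H", "G:O6"): 0.92,
    ("G:N2-H", "C:N3"): 0.10,  # weak competing contact
}

# Assumption: each donor and each acceptor carries at most one unit of flow.
capacity = {"S": {}}
for (donor, acceptor), p in h_bond_prob.items():
    capacity["S"][donor] = 1.0
    capacity.setdefault(donor, {})[acceptor] = p
    capacity.setdefault(acceptor, {})["T"] = 1.0

total, flow = max_flow(capacity, "S", "T")

# Report donor-acceptor edges whose flow exceeds an assumed 0.5 threshold.
h_bonds = sorted(edge for edge, f in flow.items()
                 if edge in h_bond_prob and f > 0.5)
print(round(total, 2), h_bonds)
```

In this toy input the three strong contacts carry most of the flow, so the pair would be read as a three-H-bond pattern, while the weak competing contact receives only the capacity left over on its acceptor. How the real program maps flows to base pairing types is not specified in the abstract.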
