Knowledge-based prediction of DNA atomic structure from nucleic sequence.

A simple knowledge-based method for predicting DNA atomic structure from nucleic sequence is presented. We used free B-DNA crystal structures to estimate the distributions of the conformational coordinates of trinucleotide base pairs and tetranucleotide base-pair steps. These distributions served as the basis for predicting the 3D positions of the non-hydrogen atoms of the nucleic bases for a DNA sequence of arbitrary length. The only constraint imposed was that the structure is B-DNA with Watson-Crick complementary base pairs. The method was tested on previously unseen DNA structures with sequence lengths ranging from 6 bp to 12 bp. The predictions have an RMSE of around 0.5 Å for the translational conformational coordinates and around 5 degrees for the rotational ones. For the coordinates of the nucleic base non-hydrogen atoms, the RMSE is around 1.1 Å. The knowledge-based method outperformed a technique based on genetic algorithms in the prediction of B-DNA structures.
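The idea of predicting a step coordinate from its sequence context, then scoring it by RMSE, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-context mean stands in for the estimated distribution, handles only a single scalar coordinate, and ignores strand symmetry and sequence ends. The function names (`fit_context_means`, `predict_steps`, `rmse`) and all numeric values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import sqrt


def fit_context_means(observations):
    """Average one conformational coordinate per sequence context.

    observations: iterable of (context, value) pairs, where context is a
    tetranucleotide string such as "ACGT". Using the mean as the summary
    of each distribution is an assumption of this sketch.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for context, value in observations:
        sums[context] += value
        counts[context] += 1
    return {context: sums[context] / counts[context] for context in sums}


def predict_steps(sequence, means, default):
    """Predict one value for each 4-base window of the sequence.

    Contexts absent from the training table fall back to `default`.
    """
    return [
        means.get(sequence[i:i + 4], default)
        for i in range(len(sequence) - 3)
    ]


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences of values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared) / len(squared))


# Made-up training values (for example, a twist angle in degrees).
training = [("ACGT", 34.0), ("ACGT", 36.0), ("CGTA", 30.0)]
means = fit_context_means(training)
predicted = predict_steps("ACGTA", means, default=36.0)
error = rmse(predicted, [36.0, 29.0])
```

With this toy table, the sequence "ACGTA" yields two windows ("ACGT" and "CGTA"), so two predicted values are compared against two made-up reference values.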
