Energy‐based prediction of amino acid‐nucleotide base recognition

Despite decades of investigation, it is not yet clear whether there are rules dictating the specificity of the interaction between amino acids and nucleotide bases. This issue was addressed by determining, in a dataset of 100 high‐resolution protein‐DNA structures, the frequency and energy of interaction between each amino acid and base, and the energetics of water‐mediated interactions. The analysis was carried out using HINT, a non‐Newtonian force field encoding both enthalpic and entropic contributions, and Rank, a geometry‐based tool for evaluating hydrogen bond interactions. A frequency‐ and energy‐based preferential interaction of Arg and Lys with G, Asp and Glu with C, and Asn and Gln with A was found. Both favorable and unfavorable contacts were found to be conserved. Water‐mediated interactions strongly increase the probability of Thr‐A, Lys‐A, and Lys‐C contacts. The frequency, interaction energy, and water enhancement factors associated with each amino acid–base pair were used to predict the base triplet recognized by the helix motif in 45 zinc fingers, which represent an ideal case study for the analysis of one‐to‐one amino acid–base pair contacts. The model correctly predicted 70.4% of 135 amino acid–base pairs, and, when each amino acid–base pair was weighted by its energetic relevance to the overall recognition energy, it yielded a prediction rate of 89.7%. © 2008 Wiley Periodicals, Inc. J Comput Chem 2008
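
The prediction step described above can be illustrated with a minimal sketch. The abstract does not state how the frequency, interaction energy, and water enhancement factor are combined, so the product used here is an assumption, and every number in the table is an invented placeholder chosen only to mirror the qualitative preferences reported (Arg with G, Asp with C, Asn with A). The names `TOY_TERMS`, `pair_score`, `predict_base`, and `predict_triplet` are hypothetical and do not come from the paper, HINT, or Rank.

```python
from typing import Dict, Sequence, Tuple

BASES = ("A", "C", "G", "T")

# (amino acid, base) -> (frequency, energy score, water enhancement factor).
# All values are invented placeholders, NOT data from the paper.
TOY_TERMS: Dict[Tuple[str, str], Tuple[float, float, float]] = {
    ("Arg", "G"): (0.60, 2.0, 1.0),
    ("Arg", "A"): (0.15, 0.5, 1.0),
    ("Asp", "C"): (0.50, 1.5, 1.0),
    ("Asp", "G"): (0.10, 0.2, 1.0),
    ("Asn", "A"): (0.40, 1.0, 1.2),
    ("Asn", "T"): (0.20, 0.4, 1.0),
}


def pair_score(aa: str, base: str,
               terms: Dict[Tuple[str, str], Tuple[float, float, float]]) -> float:
    """Combine the three terms for one amino acid-base pair.

    Assumption: a simple product; pairs missing from the table score 0.
    """
    freq, energy, water = terms.get((aa, base), (0.0, 0.0, 1.0))
    return freq * energy * water


def predict_base(aa: str,
                 terms: Dict[Tuple[str, str], Tuple[float, float, float]]) -> str:
    """Return the base with the highest combined score for this amino acid."""
    return max(BASES, key=lambda base: pair_score(aa, base, terms))


def predict_triplet(residues: Sequence[str],
                    terms: Dict[Tuple[str, str], Tuple[float, float, float]]) -> str:
    """Predict one base per contacting residue, assuming one-to-one contacts."""
    return "".join(predict_base(aa, terms) for aa in residues)


if __name__ == "__main__":
    # Three contacting residues of a hypothetical zinc finger helix.
    print(predict_triplet(["Arg", "Asp", "Asn"], TOY_TERMS))
```

With 45 zinc fingers and three contacts each, such a scheme produces the 135 amino acid–base predictions that the abstract evaluates; a 70.4% success rate corresponds to 95 of those 135 pairs.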
