DISPLAR: an accurate method for predicting DNA-binding sites on protein surfaces

Structural and physical properties of DNA provide important constraints on the binding sites formed on the surfaces of DNA-targeting proteins. Characteristics of such binding sites may form the basis for predicting DNA-binding sites from the structures of proteins alone. Such an approach has been successfully developed for predicting protein–protein interfaces. Here this approach is adapted for predicting DNA-binding sites. We used a representative set of 264 protein–DNA complexes from the Protein Data Bank to analyze binding-site characteristics and to train and test a neural network predictor of DNA-binding sites. The input to the predictor consisted of the PSI-BLAST sequence profiles and solvent accessibilities of each surface residue and its 14 closest neighboring residues. Predicted DNA-contacting residues cover 60% of actual DNA-contacting residues and have an accuracy of 76%. This method significantly outperforms previous attempts at DNA-binding site prediction. Its application to the prion protein yielded a DNA-binding site that is consistent with recent NMR chemical shift perturbation data, suggesting that the method can complement experimental techniques in characterizing protein–DNA interfaces.
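As an illustration of the input described above, the following minimal Python sketch assembles one feature vector for a residue from its own profile and solvent accessibility plus those of its 14 closest neighbors, and computes coverage and accuracy for a set of predictions. It is not the DISPLAR implementation. The function names, the use of a single coordinate per residue to rank neighbors, the 20-value profile rows, and the reading of "coverage" as correct predictions over actual contacts and "accuracy" as correct predictions over all predictions are assumptions made for illustration only.

```python
import math

def nearest_neighbors(coords, i, k=14):
    """Return indices of the k residues closest to residue i (excluding i).

    coords holds one hypothetical representative point per residue.
    """
    others = [j for j in range(len(coords)) if j != i]
    others.sort(key=lambda j: math.dist(coords[i], coords[j]))
    return others[:k]

def build_input(profiles, accessibilities, coords, i, k=14):
    """Concatenate profile row + accessibility for residue i and its k neighbors."""
    features = []
    for j in [i] + nearest_neighbors(coords, i, k):
        features.extend(profiles[j])         # assumed: 20 profile scores per residue
        features.append(accessibilities[j])  # assumed: one accessibility value per residue
    return features

def coverage_and_accuracy(predicted, actual):
    """Coverage = hits / actual contacts; accuracy = hits / predicted contacts."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    coverage = hits / len(actual) if actual else 0.0
    accuracy = hits / len(predicted) if predicted else 0.0
    return coverage, accuracy
```

Under these assumptions each input has (1 + 14) × (20 + 1) = 315 values. A predictor that flags four residues, three of which are among five true DNA-contacting residues, would have a coverage of 60% and an accuracy of 75%.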
