Evolutionary Conservation and Interacting Preference for Identifying Protein-DNA Interactions

Protein-DNA interactions play a central role in many genetic processes of cells. With the growing number of crystal structures of protein-DNA complexes, computational approaches are becoming increasingly useful for modeling protein-DNA interactions. This paper proposes a template-based alignment method with a new scoring function that combines the evolutionary conservation and the protein-DNA interaction scores of DNA-contact residues. We show that the combined scoring function models protein-DNA interactions better than either score alone. Our method achieved high accuracy in identifying the DNA-binding domains of 69 representative families and a correlation of 0.6 in predicting the binding free energies in alanine-scanning data. Applied to the hormone receptor family, the method identified the DNA-binding specificity of different subfamilies. Evolutionary conservation reflects the evolutionary pressure on DNA-contact residues, and the interaction preferences indicate the binding affinity between the protein and DNA. The experimental results show that both the evolutionary conservation and the DNA-binding capability of DNA-contact residues are essential for identifying DNA-binding domains and protein-DNA interactions.
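
To illustrate how two per-residue terms might be combined into a single score, the following minimal sketch sums a weighted mix of a conservation score and an interaction-preference score over the DNA-contact residues. The abstract does not state the functional form, so the linear weighting, the `weight` parameter, the function name `combined_score`, and the layout of the two lookup tables are all assumptions made for illustration, not the scoring function used in this work.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout (not from the paper):
#   contacts     : (query residue, template contact position, contacted base)
#   conservation : position -> residue -> conservation score at that position
#   preference   : (residue, base) -> residue-base interaction preference score
Contact = Tuple[str, int, str]


def combined_score(
    contacts: List[Contact],
    conservation: Dict[int, Dict[str, float]],
    preference: Dict[Tuple[str, str], float],
    weight: float = 0.5,
) -> float:
    """Sum a weighted mix of both terms over all DNA-contact residues.

    weight = 1.0 uses only conservation, weight = 0.0 uses only the
    interaction preference. The linear combination is an assumption.
    """
    total = 0.0
    for residue, position, base in contacts:
        # Missing entries contribute 0.0 rather than raising an error.
        cons = conservation.get(position, {}).get(residue, 0.0)
        pref = preference.get((residue, base), 0.0)
        total += weight * cons + (1.0 - weight) * pref
    return total
```

Setting `weight` to 1.0 or 0.0 recovers the two single-term scores, which gives a simple way to compare the combined score against either term alone on the same set of contacts.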
