Structure-based prediction of bZIP partnering specificity.

Predicting protein interaction specificity from sequence is an important goal in computational biology. We present a model for predicting the interaction preferences of coiled-coil peptides derived from bZIP transcription factors; the model performs very well when tested against experimental protein microarray data. Using only sequence information, we built atomic-resolution structures for 1711 dimeric complexes and evaluated them with a variety of scoring functions based on physics, learned empirical weights, or experimental coupling energies. A purely physical model, similar to those used in protein design studies, gave reasonable performance. The results improved significantly when helix propensities, rather than a structurally explicit model, were used to represent the unfolded reference state. Accounting in a generic way for residue-residue interactions in competing states improved the results further. Purely physical structure-based methods had difficulty capturing core interactions accurately, especially those involving polar residues such as asparagine. When these terms were replaced with weights from a machine-learning approach, the resulting model correctly ordered the stabilities of over 6000 pairs of complexes with greater than 90% accuracy. The final model is physically interpretable and suggests specific pairs of residues that are important for bZIP interaction specificity. Our results illustrate the power and potential of structural modeling as a method for predicting protein interactions, and they highlight obstacles that must be overcome to reach quantitative accuracy with a de novo approach. Our method shows unprecedented performance in predicting protein-protein interaction specificity using structural modeling, and it suggests that predicting coiled-coil interactions in general may be within reach.
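The pairwise-ordering figure quoted above can be illustrated with a minimal sketch. It assumes, hypothetically, that each complex has one predicted score and one measured stability, with higher values meaning more stable in both; the function name, the `min_gap` cutoff, and the toy values are illustrative assumptions and are not taken from the paper.

```python
from itertools import combinations


def pairwise_order_accuracy(predicted, measured, min_gap=0.0):
    """Fraction of complex pairs whose predicted stability order matches experiment.

    predicted, measured: dicts mapping a complex label to a number
    (assumed convention: higher means more stable in both).
    min_gap: pairs whose measured values differ by no more than this are
    skipped, since experiment cannot reliably order them (assumed cutoff).
    """
    correct = 0
    total = 0
    for a, b in combinations(sorted(predicted), 2):
        gap = measured[a] - measured[b]
        if abs(gap) <= min_gap:
            continue  # experimentally indistinguishable pair
        total += 1
        # The pair counts as correct when both differences share a sign.
        if (predicted[a] - predicted[b]) * gap > 0:
            correct += 1
    return correct / total if total else float("nan")


# Toy, made-up values for four hypothetical complexes.
scores = {"A": 3.0, "B": 2.0, "C": 1.0, "D": 0.5}
stabilities = {"A": 10.0, "B": 8.0, "C": 9.0, "D": 1.0}

accuracy = pairwise_order_accuracy(scores, stabilities)
strict_accuracy = pairwise_order_accuracy(scores, stabilities, min_gap=1.5)
```

In this toy case only the B/C pair is misordered, so five of six pairs are counted as correct; raising `min_gap` drops the closely spaced pairs from the comparison.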
