Structure-Based Predictive Models for Allosteric Hot Spots

In allostery, a binding event at one site in a protein modulates the behavior of a distant site. Identifying the residues that relay the signal between sites remains a challenge. We have developed predictive models using support-vector machines, a widely used machine-learning method. The training data set consisted of residues classified as either hotspots or non-hotspots based on experimental characterization of point mutations from a diverse set of allosteric proteins. Each residue had an associated set of calculated features. Two sets of features were used: Feature Set 1, consisting of dynamical, structural, network, and informatic measures, and Feature Set 2, consisting of structural measures defined by Daily and Gray [1]. The resulting models performed well on an independent data set consisting of hotspots and non-hotspots from five allosteric proteins. For the independent data set, our top 10 models using Feature Set 1 recalled 68–81% of known hotspots, and 58–67% of their hotspot predictions were actual hotspots. Hence, these models have precision P = 58–67% and recall R = 68–81%. The corresponding models for Feature Set 2 had P = 55–59% and R = 81–92%. We then formed a hybrid feature set by combining the features from each set that produced the best-performing models. The top 10 models using this hybrid feature set had R = 73–81% and P = 64–71%, the best overall performance of any of the sets of models. Our methods identified hotspots in structural regions of known allosteric significance. Moreover, our predicted hotspots form a network of contiguous residues in the interior of the structures, in agreement with previous work. In conclusion, we have developed models that discriminate between known allosteric hotspots and non-hotspots with high accuracy and sensitivity. In addition, the pattern of predicted hotspots corresponds to known functional motifs implicated in allostery and is consistent with previous work describing sparse networks of allosterically important residues.
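As an illustration of how the precision P and recall R quoted above are defined, the following minimal sketch computes both from binary hotspot labels. The function name `precision_recall` and the example labels are assumptions made purely for illustration; they are not the authors' code or data.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = hotspot."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # hotspots correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-hotspots predicted as hotspots
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # hotspots that were missed
    # Precision: fraction of hotspot predictions that are actual hotspots.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of known hotspots that were predicted.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels for ten residues (hypothetical, not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
p, r = precision_recall(y_true, y_pred)
print(f"P = {p:.2f}, R = {r:.2f}")  # P = 0.60, R = 0.75
```

In this toy case three of five hotspot predictions are correct (P = 60%) and three of four known hotspots are recovered (R = 75%), which mirrors how the ranges reported for each feature set should be read.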

[1]  T. Steitz,et al.  Crystal structure of lac repressor core tetramer and its implications for DNA looping. , 1995, Science.

[2]  Oliver F. Lange,et al.  Generalized correlation for biomolecular dynamics , 2005, Proteins.

[3]  C. Stanley,et al.  Expression, purification and characterization of human glutamate dehydrogenase (GDH) allosteric regulatory mutations. , 2002, The Biochemical journal.

[4]  Sebastian Doniach,et al.  Toward the mechanism of dynamical couplings and translocation in hepatitis C virus NS3 helicase using elastic network model , 2007, Proteins.

[5]  Andrew L. Lee,et al.  Frameworks for understanding long-range intra-protein communication. , 2009, Current protein & peptide science.

[6]  C. Stanley,et al.  Molecular basis and characterization of the hyperinsulinism/hyperammonemia syndrome: predominance of mutations in exons 11 and 12 of the glutamate dehydrogenase gene. HI/HA Contributing Investigators. , 2000, Diabetes.

[7]  D. Manstein,et al.  Mutations in the relay loop region result in dominant‐negative inhibition of myosin II function in Dictyostelium , 2002, EMBO reports.

[8]  K. Neet,et al.  Transients and cooperativity. A slow transition model for relating transients and cooperative kinetics of enzymes. , 1972, The Journal of biological chemistry.

[9]  K. Hinsen Analysis of domain motions by approximate normal mode calculations , 1998, Proteins.

[10]  R. Nussinov,et al.  Folding and binding cascades: shifts in energy landscapes. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[11]  Gary Siuzdak,et al.  The structure of apo human glutamate dehydrogenase details subunit communication and allostery. , 2002, Journal of molecular biology.

[12]  E. Di Cera,et al.  Molecular mechanisms of thrombin function , 1997, Cellular and Molecular Life Sciences CMLS.

[13]  K. Sharp,et al.  Pump‐probe molecular dynamics as a tool for studying protein motion and long range coupling , 2006, Proteins.

[14]  W. P. Russ,et al.  Evolutionary information for specifying a protein fold , 2005, Nature.

[15]  S. W. Hall,et al.  An Extensive Interaction Interface between Thrombin and Factor V Is Required for Factor V Activation , 2001, The Journal of Biological Chemistry.

[16]  Chenhsiung Chan,et al.  Relationship between local structural entropy and protein thermostabilty , 2004, Proteins.

[17]  D. Kern,et al.  The role of dynamics in allosteric regulation. , 2003, Current opinion in structural biology.

[18]  L. T. Ten Eyck,et al.  Rapid atomic density methods for molecular shape characterization. , 2001, Journal of molecular graphics & modelling.

[19]  Jeffrey J. Gray,et al.  Contact rearrangements form coupled networks from local motions in allosteric proteins , 2008, Proteins.

[20]  Y. Sanejouand,et al.  Building‐block approach for determining low‐frequency normal modes of macromolecules , 2000, Proteins.

[21]  V. Hilser,et al.  Structure-based calculation of the equilibrium folding pathway of proteins. Correlation with hydrogen exchange protection factors. , 1996, Journal of molecular biology.

[22]  Wenjun Zheng,et al.  Identification of dynamical correlations within the myosin motor domain by the normal mode analysis of an elastic network model. , 2005, Journal of molecular biology.

[23]  M. Lewis,et al.  The Lac repressor: a second generation of structural and functional studies. , 2001, Current opinion in structural biology.

[24]  E. Freire,et al.  Can allosteric regulation be predicted from structure? , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[25]  J. Wyman,et al.  LINKED FUNCTIONS AND RECIPROCAL EFFECTS IN HEMOGLOBIN: A SECOND LOOK. , 1964, Advances in protein chemistry.

[26]  M. Magnuson,et al.  Mutants of glucokinase cause hypoglycaemia- and hyperglycaemia syndromes and their analysis illuminates fundamental quantitative concepts of glucose homeostasis , 1999, Diabetologia.

[27]  Jianpeng Ma,et al.  Allosteric transition pathways in the lactose repressor protein core domains: Asymmetric motions in a homodimer , 2003, Protein science : a publication of the Protein Society.

[28]  Ofer Yifrach,et al.  Principles underlying energetic coupling along an allosteric communication trajectory of a voltage-activated K+ channel , 2007, Proceedings of the National Academy of Sciences.

[29]  Irene Luque,et al.  The linkage between protein folding and functional cooperativity: two sides of the same coin? , 2002, Annual review of biophysics and biomolecular structure.

[30]  V. Hilser,et al.  Intrinsic disorder as a mechanism to optimize allosteric coupling in proteins , 2007, Proceedings of the National Academy of Sciences.

[31]  Guohui Li,et al.  A coarse-grained normal mode approach for macromolecules: an efficient implementation and application to Ca(2+)-ATPase. , 2002, Biophysical journal.

[32]  M. Morales,et al.  Analytical description of the effects of modifiers and of enzyme multivalency upon the steady state catalyzed reaction rate , 1953 .

[33]  Rama Ranganathan,et al.  Allosteric determinants in guanine nucleotide-binding proteins , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[34]  R. Nussinov,et al.  Folding funnels and binding mechanisms. , 1999, Protein engineering.

[35]  Ofer Yifrach,et al.  Energetics of Pore Opening in a Voltage-Gated K+ Channel , 2002, Cell.

[36]  S. Prasad,et al.  Molecular Dissection of Na+ Binding to Thrombin , 2004, Journal of Biological Chemistry.

[37]  R. MacKinnon,et al.  Revealing the architecture of a K+ channel pore through mutant cycles with a peptide inhibitor. , 1995, Science.

[38]  A. Plaitakis,et al.  Single Amino Acid Substitution (G456A) in the Vicinity of the GTP Binding Domain of Human Housekeeping Glutamate Dehydrogenase Markedly Attenuates GTP Inhibition and Abolishes the Cooperative Behavior of the Enzyme , 2002, The Journal of Biological Chemistry.

[39]  I. Weber,et al.  Structural model of human glucokinase in complex with glucose and ATP: implications for the mutants that cause hypo- and hyperglycemia. , 1999, Diabetes.

[40]  Haibo Yu,et al.  Mechanochemical Coupling in the Myosin Motor Domain. II. Analysis of Critical Residues , 2007, PLoS Comput. Biol..

[41]  Kazuo Sutoh,et al.  Dictyostelium myosin II mutations that uncouple the converter swing and ATP hydrolysis cycle. , 2003, Biochemistry.

[42]  Andrew L. Lee,et al.  Dynamic coupling and allosteric behavior in a nonallosteric protein. , 2006, Biochemistry.

[43]  R. Ebright,et al.  Dynamically driven protein allostery , 2006, Nature Structural & Molecular Biology.

[44]  R. Nussinov,et al.  Allosteric effects in the marginally stable von Hippel–Lindau tumor suppressor protein and allostery-based rescue mutant design , 2008, Proceedings of the National Academy of Sciences.

[45]  C. Chothia,et al.  The atomic structure of protein-protein recognition sites. , 1999, Journal of molecular biology.

[46]  K. Mann,et al.  Thrombin formation. , 2003, Chest.

[47]  K. Matthews,et al.  Substitutions at histidine 74 and aspartate 278 alter ligand binding and allostery in lactose repressor protein. , 1999, Biochemistry.

[48]  J. Changeux,et al.  Allosteric Mechanisms of Signal Transduction , 2005, Science.

[49]  D. Thirumalai,et al.  Network of dynamically important residues in the open/closed transition in polymerases is strongly conserved. , 2005, Structure.

[50]  R. Nussinov,et al.  Is allostery an intrinsic property of all dynamic proteins? , 2004, Proteins.

[51]  P. Walsh,et al.  Thrombin activation of factor XI on activated platelets requires the interaction of factor XI and platelet glycoprotein Ib alpha with thrombin anion-binding exosites I and II, respectively. , 2003, The Journal of biological chemistry.

[52]  J. Wells,et al.  Searching for new allosteric sites in enzymes. , 2004, Current opinion in structural biology.

[53]  J. I. Izpisúa Belmonte,et al.  Global DNA methylation and transcriptional analyses of human ESC-derived cardiomyocytes , 2014, Protein & Cell.

[54]  M. Maurer,et al.  Examining Thrombin Hydrolysis of the Factor XIII Activation Peptide Segment Leads to a Proposal for Explaining the Cardioprotective Effects Observed with the Factor XIII V34L Mutation , 2000, The Journal of Biological Chemistry.

[55]  W. Rutter,et al.  Converting trypsin to chymotrypsin: residue 172 is a substrate specificity determinant. , 1994, Biochemistry.

[56]  Tord Snäll,et al.  Reassessing a sparse energetic network within a single protein domain , 2008, Proceedings of the National Academy of Sciences.

[57]  L. Hedstrom Serine protease mechanism and specificity. , 2002, Chemical reviews.

[58]  D. Jacobs,et al.  Protein flexibility predictions using graph theory , 2001, Proteins.

[59]  L. Johnson,et al.  A new allosteric site in glycogen phosphorylase b as a target for drug interactions. , 2000, Structure.

[60]  E. Di Cera,et al.  Thrombin is a Na(+)-activated enzyme. , 1992, Biochemistry.

[61]  C A Smith,et al.  Active site comparisons highlight structural similarities between myosin and other P-loop proteins. , 1996, Biophysical journal.

[62]  J. Spudich,et al.  Structure-function studies of the myosin motor domain: importance of the 50-kDa cleft. , 1996, Molecular biology of the cell.

[63]  Y. Cheng,et al.  Relationship between the inhibition constant (K1) and the concentration of inhibitor which causes 50 per cent inhibition (I50) of an enzymatic reaction. , 1973, Biochemical pharmacology.

[64]  C. Stanley,et al.  Hyperinsulinism and Hyperammonemia Syndrome: Report of Twelve Unrelated Patients , 2001, Pediatric Research.

[65]  C. Stanley,et al.  Structures of bovine glutamate dehydrogenase complexes elucidate the mechanism of purine regulation. , 2001, Journal of molecular biology.

[66]  G. Chang,et al.  Crystal Structure of the Lactose Operon Repressor and Its Complexes with DNA and Inducer , 1996, Science.

[67]  J. Thompson,et al.  CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. , 1994, Nucleic acids research.

[68]  M Karplus,et al.  Small-world view of the amino acids that play a key role in protein folding. , 2002, Physical review. E, Statistical, nonlinear, and soft matter physics.

[69]  Michael D. Daily,et al.  Local motions in a benchmark of allosteric proteins , 2007, Proteins.

[70]  D E Wemmer,et al.  Two-state allosteric behavior in a single-domain signaling protein. , 2001, Science.

[71]  L. Lorand,et al.  The transpeptidase system which crosslinks fibrin by gamma-glutamyl-epsilon-lysine bonds. , 1968, Biochemical and biophysical research communications.

[72]  Tal Pupko,et al.  Structural Genomics , 2005 .

[73]  G Waksman,et al.  Unexpected crucial role of residue 225 in serine proteases. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[74]  V. Hilser,et al.  Ensemble‐based signatures of energy propagation in proteins: A new view of an old phenomenon , 2005, Proteins.

[75]  Jeffrey W. Peng,et al.  Substrate recognition reduces side-chain flexibility for conserved hydrophobic residues in human Pin1. , 2007, Structure.

[76]  R. Daniel,et al.  L-glutamate dehydrogenases: distribution, properties and mechanism. , 1993, Comparative biochemistry and physiology. B, Comparative biochemistry.

[77]  Andrew L. Lee,et al.  Ligand-dependent dynamics and intramolecular signaling in a PDZ domain. , 2004, Journal of molecular biology.

[78]  J H Miller,et al.  Lac repressor genetic map in real space. , 1997, Trends in biochemical sciences.

[79]  Fabien Cailliez,et al.  Probing protein mechanics: residue-level properties and their use in defining domains. , 2004, Biophysical journal.

[80]  R. Nussinov,et al.  Residues crucial for maintaining short paths in network communication mediate signaling in proteins , 2006, Molecular systems biology.

[81]  Jeffrey J. Gray,et al.  Allosteric Communication Occurs via Networks of Tertiary and Quaternary Motions in Proteins , 2009, PLoS Comput. Biol..

[82]  Rama Ranganathan,et al.  Structural Determinants of Allosteric Ligand Activation in RXR Heterodimers , 2004, Cell.

[83]  E. Di Cera,et al.  Molecular mapping of thrombin‐receptor interactions , 2001, Proteins.

[84]  Vincent J Hilser,et al.  Local conformational fluctuations can modulate the coupling between proton binding and global structural transitions in proteins. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[85]  B. Goldin,et al.  L-Glutamate Dehydrogenases , 1971 .

[86]  D. Danley,et al.  Discovery of a human liver glycogen phosphorylase inhibitor that lowers blood glucose in vivo. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[87]  Philip E. Bourne,et al.  Identifying allosteric fluctuation transitions between different protein conformational states as applied to Cyclin Dependent Kinase 2 , 2007, BMC Bioinformatics.

[88]  T. L. Hill Effect of Nearest Neighbor Substrate Interactions on the Rate of Enzyme and Catalytic Reactions , 1952 .

[89]  C. Chennubhotla,et al.  Markov propagation of allosteric effects in biomolecular systems: application to GroEL–GroES , 2006, Molecular systems biology.

[90]  E. Schaftingen,et al.  Study of the regulatory properties of glucokinase by site-directed mutagenesis: conversion of glucokinase to an enzyme with high affinity for glucose. , 2000, Diabetes.

[91]  Debajyoti Datta,et al.  An allosteric circuit in caspase-1. , 2008, Journal of molecular biology.

[92]  D. Dryden,et al.  Allostery without conformational change , 1984, European Biophysics Journal.

[93]  J. Morser,et al.  TAFI, or Plasma Procarboxypeptidase B, Couples the Coagulation and Fibrinolytic Cascades through the Thrombin-Thrombomodulin Complex , 1996, The Journal of Biological Chemistry.

[94]  J. Janin,et al.  Dissecting protein–protein recognition sites , 2002, Proteins.

[95]  C Cruz,et al.  Genetic studies of the lac repressor. XIV. Analysis of 4000 altered Escherichia coli lac repressors reveals essential and non-essential residues, as well as "spacers" which do not require a specific sequence. , 1994, Journal of molecular biology.

[96]  David A Agard,et al.  Intramolecular signaling pathways revealed by modeling anisotropic thermal diffusion. , 2005, Journal of molecular biology.

[97]  H. Wolfson,et al.  Protein functional epitopes: hot spots, dynamics and combinatorial libraries. , 2001, Current opinion in structural biology.

[98]  Ad Bax,et al.  Quaternary structure of hemoglobin in solution , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[99]  P. Walsh,et al.  Thrombin Activation of Factor XI on Activated Platelets Requires the Interaction of Factor XI and Platelet Glycoprotein Ibα with Thrombin Anion-binding Exosites I and II, Respectively , 2003, Journal of Biological Chemistry.

[100]  E. Di Cera,et al.  The Na+ Binding Site of Thrombin , 1995, The Journal of Biological Chemistry.

[101]  Y. Sanejouand,et al.  A new approach for determining low‐frequency normal modes in macromolecules , 1994 .

[102]  K. Nogami,et al.  Exosite-interactive Regions in the A1 and A2 Domains of Factor VIII Facilitate Thrombin-catalyzed Cleavage of Heavy Chain , 2005, Journal of Biological Chemistry.

[103]  R. Nussinov,et al.  Folding and binding cascades: Dynamic landscapes and population shifts , 2008, Protein science : a publication of the Protein Society.

[104]  J. Changeux,et al.  ON THE NATURE OF ALLOSTERIC TRANSITIONS: A PLAUSIBLE MODEL. , 1965, Journal of molecular biology.

[105]  C. Stanley,et al.  Molecular Basis and Characterization of the Hyperinsulinism/Hyperammonemia Syndrome Predominance of Mutations in Exons 11 and 12 of the Glutamate Dehydrogenase Gene , 2000 .

[106]  R. Ranganathan,et al.  Evolutionarily conserved pathways of energetic connectivity in protein families. , 1999, Science.

[107]  R. Nussinov,et al.  Protein binding versus protein folding: the role of hydrophilic bridges in protein associations. , 1997, Journal of molecular biology.

[108]  C. Chennubhotla,et al.  Intrinsic dynamics of enzymes in the unbound state and relation to allosteric regulation. , 2007, Current opinion in structural biology.

[109]  R. Yasuda,et al.  Modulation of actin filament sliding by mutations of the SH2 cysteine in Dictyostelium myosin II. , 1997, Biochemical and biophysical research communications.

[110]  Jian Zhang,et al.  Conformational transition pathway in the allosteric process of human glucokinase , 2006, Proceedings of the National Academy of Sciences.

[111]  F. S. Mathews,et al.  Structural identification of the pathway of long-range communication in an allosteric enzyme , 2008, Proceedings of the National Academy of Sciences.

[112]  T Shimada,et al.  Mutational Analysis of the Switch II Loop of Dictyostelium Myosin II , 1998, The Journal of Biological Chemistry.

[113]  Carl Frieden Kinetic aspects of regulation of metabolic processes. The hysteretic enzyme concept. , 1970, The Journal of biological chemistry.

[114]  V. L. Rath,et al.  Human liver glycogen phosphorylase inhibitors bind at a new allosteric site. , 2000, Chemistry & biology.

[115]  Tirion,et al.  Large Amplitude Elastic Motions in Proteins from a Single-Parameter, Atomic Analysis. , 1996, Physical review letters.

[116]  E. Di Cera,et al.  An allosteric switch controls the procoagulant and anticoagulant activities of thrombin. , 1995, Proceedings of the National Academy of Sciences of the United States of America.

[117]  R. Abagyan,et al.  Predictions of protein flexibility: First‐order measures , 2004, Proteins.

[118]  C. Esmon The protein C pathway. , 2003, Critical care medicine.

[119]  H. Schaffhauser,et al.  Allosteric approaches to the targeting of G-protein-coupled receptors for novel drug discovery: a critical assessment. , 2007, Biochemical pharmacology.

[120]  Teruyuki Nishimura,et al.  Structural basis for allosteric regulation of the monomeric allosteric enzyme human glucokinase. , 2004, Structure.

[121]  D. Patel,et al.  Concerted motions in HIV-1 TAR RNA may allow access to bound state conformations: RNA dynamics from NMR residual dipolar couplings. , 2002, Journal of molecular biology.

[122]  A. Fersht Structure and mechanism in protein science , 1998 .

[123]  M. Stoffel,et al.  Nonsense mutation in the glucokinase gene causes early-onset non-insulin-dependent diabetes mellitus , 1992, Nature.

[124]  Victoria A. Higman,et al.  Uncovering network systems within protein structures. , 2003, Journal of molecular biology.

[125]  M. Lewis,et al.  A closer view of the conformation of the Lac repressor bound to operator , 2000, Nature Structural Biology.

[126]  E. Di Cera,et al.  Thrombin allostery. , 2007, Physical chemistry chemical physics : PCCP.

[127]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[128]  J H Miller,et al.  Genetic studies of the lac repressor. I. Correlation of mutational sites with specific amino acid residues: construction of a colinear gene-protein map. , 1977, Journal of molecular biology.

[129]  J. Lee,et al.  A linear correlation between the energetics of allosteric communication and protein flexibility in the Escherichia coli cyclic AMP receptor protein revealed by mutation-induced changes in compressibility and amide hydrogen-deuterium exchange. , 2004, Biochemistry.

[130]  R. Jernigan,et al.  Global ribosome motions revealed with elastic network model. , 2004, Journal of structural biology.

[131]  S. Prasad,et al.  Residue Asp-189 Controls both Substrate Binding and the Monovalent Cation Specificity of Thrombin , 2004, Journal of Biological Chemistry.

[132]  E. Cera,et al.  Rational engineering of activity and specificity in a serine protease , 1997, Nature Biotechnology.

[133]  L. Lorand,et al.  The transpeptidase system which crosslinks fibrin by γ-glutamyl-ε-lysine bonds , 1968 .

[134]  M. Mansour,et al.  Anilinoquinazoline inhibitors of fructose 1,6-bisphosphatase bind at a novel allosteric site: synthesis, in vitro characterization, and X-ray crystallography. , 2002, Journal of medicinal chemistry.

[135]  G Vriend,et al.  WHAT IF: a molecular modeling and drug design program. , 1990, Journal of molecular graphics.

[136]  K. Hinsen,et al.  Analysis of domain motions in large proteins , 1999, Proteins.

[137]  D. Koshland,et al.  Comparison of experimental binding data and theoretical models in proteins containing subunits. , 1966, Biochemistry.