Relating destabilizing regions to known functional sites in proteins

Background

Most methods for predicting functional sites in protein 3D structures rely on information on related proteins and cannot be applied to proteins with no known relatives. Another limitation of these methods is the lack of a well-annotated set of functional sites to use as a benchmark for validating their predictions. Experimental findings and theoretical considerations suggest that residues involved in function often contribute unfavorably to the native state stability. We examine the possibility of systematically exploiting this intrinsic property to identify functional sites using an original procedure that detects destabilizing regions in protein structures. In addition, to relate destabilizing regions to known functional sites, a novel benchmark consisting of a diverse set of hand-curated protein functional sites is derived.

Results

A procedure for detecting clusters of destabilizing residues in protein structures is presented. Individual residue contributions to protein stability are evaluated using detailed atomic models and a force field successfully applied in computational protein design. The most destabilizing residues, and some of their closest neighbours, are clustered into destabilizing regions following a rigorous protocol. Our procedure is applied to high-quality apo structures of 63 unrelated proteins. The biologically relevant binding sites of these proteins were annotated using all available information, including structural data and literature curation, resulting in the largest hand-curated dataset of binding sites in proteins available to date. Comparing the destabilizing regions with the annotated binding sites in these proteins, we find that the overlap is on average limited, but significantly better than random. Results depend on the type of bound ligand. Significant overlap is obtained for most polysaccharide- and small ligand-binding sites, whereas no overlap is observed for most nucleic acid-binding sites. These differences are rationalised in terms of the geometry and energetics of the binding site.

Conclusion

We find that although destabilizing regions as detected here cannot in general be used to predict binding sites in protein structures, they can provide useful information, particularly on the location of functional sites that bind polysaccharides and small ligands. This information can be exploited in methods for predicting function in protein structures with no known relatives. Our publicly available benchmark of hand-curated functional sites in proteins should help other workers derive and validate new prediction methods.
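
As a rough illustration of the kind of analysis summarised above, the following Python sketch groups the most destabilizing residues into spatially linked regions and compares their overlap with an annotated binding site against randomly chosen residues. It is not the protocol used in this work, whose details are not given in the abstract. The function names, the sign convention (higher value means more destabilizing), the `top_n` and `cutoff` parameters, and the toy data are all assumptions made for the example, and the addition of neighbouring residues mentioned in the text is omitted.

```python
import math
import random


def destabilizing_regions(energies, coords, top_n=3, cutoff=6.0):
    """Cluster the top_n most destabilizing residues by distance linkage.

    energies: residue id -> stability contribution (assumed: higher = more destabilizing)
    coords:   residue id -> (x, y, z) of one representative point per residue
    top_n, cutoff: hypothetical parameters, not values from the paper
    """
    top = sorted(energies, key=energies.get, reverse=True)[:top_n]
    regions = []
    for res in top:
        linked, rest = [], []
        for region in regions:
            near = any(math.dist(coords[res], coords[m]) <= cutoff for m in region)
            (linked if near else rest).append(region)
        # Merge the residue with every region it touches (single linkage).
        regions = rest + [{res}.union(*linked)]
    return regions


def overlap_fraction(regions, site):
    """Fraction of annotated site residues covered by any region."""
    predicted = set().union(*regions)
    return len(predicted & site) / len(site)


def random_overlap(n_predicted, residues, site, trials=1000, seed=0):
    """Mean site coverage when n_predicted residues are drawn at random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = set(rng.sample(residues, n_predicted))
        total += len(sample & site) / len(site)
    return total / trials


# Toy data, invented for illustration only.
energies = {1: 2.0, 2: 1.5, 3: -1.0, 4: 1.8, 5: -0.5}
coords = {1: (0, 0, 0), 2: (3, 0, 0), 3: (6, 0, 0), 4: (30, 0, 0), 5: (33, 0, 0)}
site = {1, 2, 3}

regions = destabilizing_regions(energies, coords)
observed = overlap_fraction(regions, site)
baseline = random_overlap(sum(len(r) for r in regions), sorted(energies), site)
print(regions, observed, baseline)
```

Comparing `observed` with `baseline` mirrors, in a very simplified way, the "better than random" comparison reported in the Results. A real analysis would need a proper significance test and the actual per-residue energies computed from the force field.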
