Boosting Prediction Performance of Protein-Protein Interaction Hot Spots by Using Structural Neighborhood Properties - (Extended Abstract)

The highly specific binding of one protein to another to form a stable complex is critical in most biological processes, yet the mechanisms of protein interaction are not fully understood. Identifying hot spots, the small subset of binding-interface residues that accounts for the majority of the binding free energy, is becoming increasingly important for understanding the principles of protein interactions. Although experimental techniques such as alanine scanning mutagenesis and a variety of computational methods have been applied to this problem, comparative studies suggest that the development of accurate and reliable solutions is still in its infancy. We developed PredHS (Prediction of Hot Spots), a computational method that effectively identifies hot spots on protein-binding interfaces using 38 optimally chosen properties. This optimal combination of features was selected from a set of 324 novel structural neighborhood properties by a two-step feature selection method consisting of a random forest algorithm and a sequential backward elimination method. We evaluated PredHS with 10-fold cross-validation on a benchmark of 265 alanine-mutated interface residues (Dataset I) and on a trimmed subset (Dataset II). Compared with state-of-the-art approaches, PredHS achieves a significant improvement in prediction quality, which stems from the new structural neighborhood properties, the novel way of generating features, and the selection power of the proposed two-step method. We further validated the capability of our method with an independent test and obtained promising results.
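The two-step selection described above can be illustrated with a minimal sketch. This is not the PredHS implementation. It assumes that step one reduces to keeping the top-ranked features from precomputed importance scores (in PredHS these would come from a random forest), and that step two removes one feature at a time for as long as a user-supplied score does not decrease. The function names, the feature names, and the toy scoring function are all hypothetical; the abstract does not state the stopping criterion, so the one used here is an assumption.

```python
from typing import Callable, Dict, List, Sequence


def top_k_by_importance(importance: Dict[str, float], k: int) -> List[str]:
    """Step 1 (assumed form): keep the k features with the highest importance."""
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    return ranked[:k]


def backward_elimination(
    features: Sequence[str], score: Callable[[List[str]], float]
) -> List[str]:
    """Step 2 (assumed form): greedily drop one feature at a time while the
    score of the remaining subset does not decrease."""
    current = list(features)
    best = score(current)
    while len(current) > 1:
        # Score every subset obtained by removing exactly one feature.
        candidates = [
            (score([f for f in current if f != drop]), drop) for drop in current
        ]
        cand_score, drop = max(candidates, key=lambda pair: pair[0])
        if cand_score < best:
            break  # every possible removal hurts, so stop
        current.remove(drop)
        best = cand_score
    return current


# Toy data: hypothetical importance scores for four hypothetical features.
importance = {"a": 0.9, "b": 0.5, "c": 0.4, "d": 0.1}
informative = {"a", "c"}


def toy_score(subset: List[str]) -> float:
    # Stand-in for a cross-validated metric: reward informative features
    # and charge a small penalty per feature kept.
    return len(informative & set(subset)) - 0.1 * len(subset)


shortlist = top_k_by_importance(importance, k=3)
selected = backward_elimination(shortlist, toy_score)
print(shortlist, selected)
```

In practice the scoring callable would wrap a cross-validated classifier evaluation, which keeps the elimination loop independent of the model being used.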
