Identification of Hotspots in Protein-Protein Interactions Based on Recursive Feature Elimination

Studying protein-protein interactions and protein structure with computational methods is critical to understanding protein function. Hot spot residues, the small set of interface residues that contribute most of the binding energy, are an important target of bioinformatics analysis. However, conventional hot spot prediction methods face great challenges. This paper proposes a hot spot prediction method that uses the feature selection method SVM-RFE (support vector machine recursive feature elimination) to improve training performance. First, SMOTE-based oversampling adds new samples so that the classifier does not overfit. SVM-RFE is then invoked to obtain an optimal feature subset. Finally, a feature-based SVM is built on that subset to predict the hot spots. Experimental results indicate that hot spot prediction performance is significantly improved compared with previous methods.
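The three stages described above (SMOTE oversampling, SVM-RFE feature selection, final SVM) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses only the Python standard library, substitutes a small linear SVM trained by batch subgradient descent for a production SVM solver, and ranks features by squared weight as in the standard SVM-RFE criterion. The toy data, the three feature columns, and all function names and hyperparameters (`smote`, `train_linear_svm`, `svm_rfe`, `lam`, `eta`, `n_keep`) are hypothetical.

```python
import random

def smote(minority, n_new, k=3, rng=None):
    """SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority neighbours."""
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment x -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Stand-in linear SVM: batch subgradient descent on the
    L2-regularised hinge loss. Labels must be +1 / -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - eta * g for wj, g in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b

def svm_rfe(X, y, n_keep):
    """SVM-RFE: retrain, drop the feature with the smallest squared
    weight, and repeat until n_keep features remain."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        Xr = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(Xr, y)
        worst = min(range(len(remaining)), key=lambda j: w[j] ** 2)
        del remaining[worst]
    return remaining

# Hypothetical toy data: column 0 is informative, columns 1-2 are noise.
rng = random.Random(42)

def make(label, n):
    return [[label * (2 + rng.uniform(-0.3, 0.3)),
             rng.uniform(-0.5, 0.5),
             rng.uniform(-0.5, 0.5)] for _ in range(n)]

hot = make(+1, 4)        # minority class: hot spots
non_hot = make(-1, 12)   # majority class: non-hot spots

# Stage 1: balance the classes with synthetic hot spot samples.
synthetic = smote(hot, len(non_hot) - len(hot), rng=rng)
X = hot + synthetic + non_hot
y = [1] * (len(hot) + len(synthetic)) + [-1] * len(non_hot)

# Stage 2: select a feature subset with SVM-RFE.
selected = svm_rfe(X, y, n_keep=1)

# Stage 3: train the final classifier on the selected features only.
w, b = train_linear_svm([[row[j] for j in selected] for row in X], y)

def predict(x):
    score = sum(wj * x[j] for wj, j in zip(w, selected)) + b
    return 1 if score > 0 else -1
```

In practice one would more likely rely on established library implementations of SMOTE, RFE, and SVM, and choose the size of the retained feature subset by cross-validation rather than fixing `n_keep` in advance.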
