Protein-protein interface prediction based on hexagon structure similarity

The study of protein-protein interactions is important in proteome research. Building more effective models from sequence information, structure information, and physicochemical characteristics is key to protein-protein interface prediction. In this paper, we propose a novel method, based on hexagon structure similarity, for identifying the interface residues of an input protein given both its sequence and its 3D structure. Experiments show that our method achieves better results than some state-of-the-art methods for identifying protein-protein interfaces. Compared with existing methods, our approach improves the F-measure by at least 0.03. On a common dataset consisting of 41 complexes, our method has overall precision and recall of 63% and 57%, respectively. On Benchmark v4.0, it has overall precision and recall of 55% and 56%. On CAPRI targets, it has overall precision and recall of 52% and 55%.
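To make the reported metrics concrete, the following is a minimal sketch of how residue-level precision, recall, and F-measure are commonly computed. It assumes the F-measure is the usual harmonic mean of precision and recall (F1); the abstract does not say whether the values are pooled over all residues or averaged per complex, so the numbers derived here from the overall precision and recall are illustrative only. The function names and the toy residue sets are hypothetical.

```python
def precision_recall(predicted, actual):
    """Residue-level precision and recall for one protein.

    predicted, actual: sets of residue identifiers (hypothetical inputs).
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision/recall pairs quoted in the abstract.
reported = {
    "41 complexes": (0.63, 0.57),
    "Benchmark v4.0": (0.55, 0.56),
    "CAPRI targets": (0.52, 0.55),
}

if __name__ == "__main__":
    for name, (p, r) in reported.items():
        print(f"{name}: F-measure = {f_measure(p, r):.3f}")

    # Toy example: residues 2 and 3 are correctly predicted as interface.
    p, r = precision_recall({1, 2, 3, 4}, {2, 3, 5})
    print(f"toy example: precision={p:.2f}, recall={r:.2f}, F={f_measure(p, r):.2f}")
```

Under this assumed definition, an improvement of 0.03 in F-measure can come from a gain in precision, in recall, or in both, which is why the abstract reports the two values alongside it.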
