Fuzzy SVM with a Novel Membership Function for Prediction of Protein-Protein Interaction Sites in Homo sapiens

Predicting the residues that participate in protein–protein interactions (PPI) helps to identify the amino acids located at the interface. In this work, experimentally verified 3-D structures of protein complexes are used to build the training model, which then predicts protein interaction sites from sequence information. Fuzzy SVM (F-SVM), which is built on top of the classical SVM, is an effective method for this problem, and we demonstrate that the performance of the SVM can be improved further by using a custom-designed fuzzy membership function. We compare the performance of SVM and F-SVM on the PPI database of the Homo sapiens organism and assess the statistical significance of the improvement of F-SVM over the classical SVM. To predict interaction sites in protein complexes, the local composition of amino acids is used together with their physico-chemical characteristics. The F-SVM based residue prediction method applies the membership function to each sequence fragment pair, and in all cases F-SVM improves on the performance of the corresponding SVM classifier. On the test samples, F-SVM achieves an area under the ROC curve (AUC) of 80.16%, which is around 1.55% higher than the classical SVM classifier.
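The abstract does not specify the custom membership function, so the following is only a minimal sketch of the general idea, assuming the classical distance-to-class-centre membership from the fuzzy SVM literature rather than the novel function proposed here. Each training sample receives a weight in (0, 1] that shrinks as the sample moves away from the mean of its own class, so likely outliers contribute less to the slack penalty. The function name `class_center_memberships`, the parameter `delta`, and the toy feature vectors are hypothetical and chosen for illustration.

```python
import math


def class_center_memberships(points, labels, delta=1e-6):
    """Return one fuzzy membership in (0, 1] per sample.

    Assumed form (not the paper's novel function):
        s_i = 1 - d_i / (r_c + delta)
    where d_i is the Euclidean distance of sample i to the mean of its own
    class c, and r_c is the largest such distance within class c.
    """
    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    memberships = [0.0] * len(points)
    for indices in by_class.values():
        dim = len(points[indices[0]])
        # Class centre: component-wise mean of the class members.
        center = [
            sum(points[i][k] for i in indices) / len(indices)
            for k in range(dim)
        ]
        dists = {i: math.dist(points[i], center) for i in indices}
        radius = max(dists.values())
        for i in indices:
            # delta > 0 keeps the farthest sample's membership above zero.
            memberships[i] = 1.0 - dists[i] / (radius + delta)
    return memberships


# Toy feature vectors (hypothetical): +1 = interaction site, -1 = non-site.
features = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
classes = [1, 1, 1, -1]
weights = class_center_memberships(features, classes)
```

Memberships of this kind are typically used as per-sample weights on the SVM slack penalty. In scikit-learn, for example, they could be passed through the `sample_weight` argument of `SVC.fit`, which rescales the penalty `C` per sample; whether that matches the formulation used in this work is an assumption.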
