The Disulfide Connectivity Prediction with Support Vector Machine and Behavior Knowledge Space

A disulfide bond, formed by two oxidized cysteines, plays an important role in protein folding and structural stability, and it may regulate protein function. The disulfide connectivity prediction problem is to determine which cysteines are bonded to each other in a target protein. The problem is difficult because the number of possible connectivity patterns grows rapidly with the number of cysteines. In this paper, we identify rules that discriminate the patterns with high accuracy across various methods. We then propose pattern-wise and pairwise BKS (behavior knowledge space) methods to fuse multiple classifiers built with SVMs (support vector machines). Furthermore, we incorporate the CSP (cysteine separation profile) method to form our hybrid method. With 4-fold cross-validation on the SP39 dataset, our hybrid method achieves a prediction accuracy of 69.1%, which is better than the best previous result of 65.9%.
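To make the growth in the number of patterns concrete, the following minimal sketch counts and enumerates connectivity patterns. It assumes that every cysteine takes part in exactly one intrachain disulfide bond, so a protein with B bonds has 2B bonded cysteines and (2B-1)!! = 1 · 3 · ... · (2B-1) possible patterns. The function names and the position labels are illustrative only and do not come from the paper.

```python
from math import prod


def count_patterns(n_bonds: int) -> int:
    """Count the pairings of 2*n_bonds cysteines: (2B-1)!! = 1*3*...*(2B-1).

    Assumption: every cysteine is bonded to exactly one other cysteine.
    """
    return prod(range(1, 2 * n_bonds, 2))


def enumerate_patterns(cysteines):
    """Yield every connectivity pattern as a list of (i, j) bonded pairs.

    `cysteines` is a list holding an even number of position labels.
    """
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    # Pair the first cysteine with each candidate partner, then recurse
    # on the cysteines that are still unpaired.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in enumerate_patterns(remaining):
            yield [(first, partner)] + tail


if __name__ == "__main__":
    for n_bonds in range(2, 6):
        print(n_bonds, "bonds:", count_patterns(n_bonds), "patterns")
    # Hypothetical cysteine positions, used only for illustration.
    for pattern in enumerate_patterns([1, 2, 3, 4]):
        print(pattern)
```

Under this assumption the count rises from 3 patterns for two bonds to 945 for five, which is why treating each whole pattern as a separate class quickly becomes impractical as the number of cysteines increases.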
