Support vector machines for the classification and prediction of β‐turn types

Support vector machines (SVMs) are proposed for this task because they can capture the sequence‐coupling effect within a tetrapeptide, not only to distinguish β‐turns from non‐β‐turns but also to distinguish among the different β‐turn types. Trained on 6022 tetrapeptides, the model gives self‐consistency rates of 99.92%, 96.8%, 98.02%, 97.75%, 100%, 97.19% and 100% for β‐turn types I, I′, II, II′, VI and VIII and for non‐β‐turns, respectively. Using these training data, the SVMs were then applied to the protein rubredoxin (54 residues, 51 tetrapeptides), which contains 12 type I β‐turn tetrapeptides, 1 type II β‐turn tetrapeptide and 38 non‐β‐turns; the rate of correct prediction reached 82.4%. This high prediction quality implies that the formation of a particular β‐turn type, or of a non‐β‐turn, is substantially correlated with the sequence of the tetrapeptide. Compared with the neural network method, SVMs save CPU time and avoid the overfitting problem. Copyright © 2002 European Peptide Society and John Wiley & Sons, Ltd.
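The tetrapeptide bookkeeping in the abstract can be illustrated with a minimal sketch. It assumes that the tetrapeptides are the overlapping four-residue windows of the chain (which is consistent with 54 residues giving 51 tetrapeptides) and that each window is turned into a binary feature vector. The one-hot encoding, the function names and the placeholder sequence are assumptions made for illustration only; the abstract does not state how the sequences were encoded, and the placeholder is not the real rubredoxin sequence.

```python
from typing import List

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tetrapeptides(sequence: str) -> List[str]:
    """Return every overlapping window of four consecutive residues."""
    return [sequence[i:i + 4] for i in range(len(sequence) - 3)]


def one_hot(tetrapeptide: str) -> List[int]:
    """Encode a tetrapeptide as 4 x 20 = 80 binary features.

    This encoding is an assumption for illustration; the abstract does
    not say which input representation the SVMs used.
    """
    vector = [0] * (4 * len(AMINO_ACIDS))
    for position, residue in enumerate(tetrapeptide):
        vector[position * len(AMINO_ACIDS) + AMINO_ACIDS.index(residue)] = 1
    return vector


# Placeholder 54-residue chain (hypothetical, NOT rubredoxin's sequence).
placeholder = (AMINO_ACIDS * 3)[:54]

windows = tetrapeptides(placeholder)
n_windows = len(windows)                  # 54 - 3 = 51 windows
features = [one_hot(w) for w in windows]  # one 80-dimensional vector per window

# The class counts in the abstract add up to the window count: 12 + 1 + 38 = 51.
# A rate of 82.4% over 51 windows is consistent with 42 correct predictions
# (an inference from the reported figures, not a number given in the abstract).
accuracy_percent = round(100 * 42 / 51, 1)
```

Feature vectors of this kind could then be passed to any multi-class SVM implementation; the choice of kernel and parameters is not given in the abstract and is left open here.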
