BhairPred: prediction of β-hairpins in a protein from multiple alignment information using ANN and SVM techniques

This paper describes a method for predicting a supersecondary structural motif, the β-hairpin, in a protein sequence. The method was trained and tested on a set of 5102 hairpins and 5131 non-hairpins, obtained from a non-redundant dataset of 2880 proteins using the DSSP and PROMOTIF programs. Two machine-learning techniques, an artificial neural network (ANN) and a support vector machine (SVM), were used to predict β-hairpins. The ANN achieved an accuracy of 65.5% when the amino acid sequence alone was used as the input. The accuracy improved from 65.5 to 69.1% when evolutionary information (PSI-BLAST profile), observed secondary structure and surface accessibility were used as the inputs. The accuracy improved further, from 69.1 to 79.2%, when the SVM was used for classification instead of the ANN. The performance of the methods was also assessed in a test case in which predicted secondary structure and surface accessibility were used instead of the observed values. The highest accuracy achieved by the SVM-based method in the test case was 77.9%. On a dataset previously used by X. Cruz, E. G. Hutchinson, A. Shephard and J. M. Thornton (2002) Proc. Natl Acad. Sci. USA, 99, 11157–11162, a maximum accuracy of 71.1% with a Matthews correlation coefficient of 0.41 was obtained in the test case. The performance of the method was also evaluated on proteins used in the ‘6th community-wide experiment on the critical assessment of techniques for protein structure prediction (CASP6)’. Based on the algorithm described, a web server, BhairPred, has been developed, which can be used to predict β-hairpins in a protein using the SVM approach.
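The two figures of merit quoted above, accuracy and the Matthews correlation coefficient (MCC), can both be computed from the counts of a two-class confusion matrix. The sketch below is illustrative only: the function name `accuracy_and_mcc` and the counts in the usage lines are hypothetical and are not taken from the paper, and the convention of returning an MCC of 0.0 when the denominator vanishes is an assumption, not something the abstract specifies.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary hairpin / non-hairpin classifier.

    tp: hairpins predicted as hairpins
    tn: non-hairpins predicted as non-hairpins
    fp: non-hairpins predicted as hairpins
    fn: hairpins predicted as non-hairpins
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Fraction of all examples that were classified correctly.
    accuracy = (tp + tn) / total

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, mcc


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    acc, mcc = accuracy_and_mcc(tp=40, tn=30, fp=20, fn=10)
    print(f"accuracy = {acc:.3f}, MCC = {mcc:.3f}")
```

Because the hairpin and non-hairpin sets are nearly balanced (5102 versus 5131), accuracy is already informative here; the MCC adds a measure that stays meaningful if the class ratio differs, as it may on an independent dataset.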