PSSP with dynamic weighted kernel fusion based on SVM-PHGS

Since the 1960s, researchers have proposed several methods for protein secondary structure prediction (PSSP), but their accuracy does not exceed 80%. A more accurate prediction method is therefore needed. Support vector machines (SVMs) have shown results comparable to or better than those of neural networks in bioinformatics applications. This research proposes an SVM-based method improved by a new parallel multi-class (PMC) method, parallel hierarchical grid search (PHGS), a cross-validation (CV) technique, and a weighted kernel fusion (WKF) method. PHGS is used to tune the parameters of the SVM kernel function, which have an important impact on accuracy. Choosing suitable input data and a suitable kernel function for a particular problem can improve the prediction results remarkably, so Position-Specific Scoring Matrix (PSSM) profiles are used as the input information. The goals of this study are to calibrate the kernel function parameters and to fuse the results of different kernel functions in order to determine protein secondary structure classes accurately. Because the choice of fusion method is important for achieving high performance, we propose a dynamic weight allocation method based on a non-linear analysis system. The method achieves classification accuracies of 84.65% and 83.94% on the RS126 and CB513 datasets, respectively, which is very promising relative to other classification methods in the literature for this problem. An independent dataset is also used to evaluate the method against other state-of-the-art methods. The results show that the comprehensibility of WKF based on SVM-PHGS is much better than that of other methods.
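The idea of fusing the outputs of several kernel-specific classifiers with unequal weights can be illustrated with a minimal sketch. The abstract does not specify the dynamic weight allocation rule, so everything below is an assumption made for illustration only: the softmax mapping from validation accuracy to weight, the `temperature` parameter, the function names, and the example scores and accuracies are all hypothetical and are not taken from the paper.

```python
import math

CLASSES = ("H", "E", "C")  # helix, strand, coil


def softmax_weights(val_accuracies, temperature=0.05):
    """Map per-kernel validation accuracies to weights that sum to 1.

    Hypothetical rule: a softmax is one simple non-linear mapping that
    gives better-validated kernels a larger share of the vote.
    """
    best = max(val_accuracies)
    exps = [math.exp((a - best) / temperature) for a in val_accuracies]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(scores_per_kernel, weights):
    """Weighted sum of per-class scores; returns (winning class, fused scores)."""
    fused = {c: 0.0 for c in CLASSES}
    for scores, w in zip(scores_per_kernel, weights):
        for c in CLASSES:
            fused[c] += w * scores[c]
    winner = max(CLASSES, key=lambda c: fused[c])
    return winner, fused


# Made-up numbers for one residue: three classifiers (e.g. RBF, polynomial,
# linear kernels) with assumed validation accuracies and class scores.
val_acc = [0.80, 0.76, 0.70]
scores = [
    {"H": 0.6, "E": 0.3, "C": 0.1},  # most accurate kernel prefers H
    {"H": 0.2, "E": 0.5, "C": 0.3},  # prefers E
    {"H": 0.1, "E": 0.6, "C": 0.3},  # prefers E
]

weights = softmax_weights(val_acc)
winner, fused = fuse(scores, weights)
print(weights, winner, fused)
```

With these made-up numbers, an unweighted majority vote would choose E (two of three classifiers), whereas the weighted fusion chooses H because the best-validated kernel dominates. In a real pipeline the per-class scores would come from trained SVMs and the weights from cross-validation results.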
