A novel Group Template Pattern Classifiers (GTPCs) method in protein secondary structure prediction

Protein secondary structure prediction is a fundamental and challenging problem. In this study we propose a novel method called Group Template Pattern Classifiers (GTPCs), which captures the long-range interactions of the whole protein to a large degree. First, the protein dataset is divided into several groups based on protein length, and the proteins of “similar” length in each group are used to build that group's prediction model. Wavelets are then used to extract features from the PSSM profile and to characterize how the profile changes along the sequence. A new protein is predicted by the GTPCs model that corresponds to its length. Experiments are performed on the 25PDB, CB513, CASP9, CASP10, CASP11, and CASP12 datasets, where six GTPCs models achieve a performance of 86.38%, 84.11%, 83.92%, 83.07%, 81.98%, and 82.35%, respectively. In our experiments, these results are better than those of other state-of-the-art methods. (http://qilubio.qlu.edu.cn/protein_GTPC)
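The two steps described above, routing a protein to a length group and summarizing a PSSM column with a wavelet transform, can be illustrated with a minimal sketch. It is not the authors' implementation: the length boundaries in `LENGTH_BOUNDS` are hypothetical (the abstract does not give the cut points), the Haar wavelet is assumed only because it is the simplest choice (the abstract does not name the wavelet family), and the function names are invented for illustration.

```python
import math
from bisect import bisect_right

# Hypothetical cut points: five boundaries give six length groups.
# The abstract reports six models but does not state the boundaries.
LENGTH_BOUNDS = [100, 150, 200, 300, 400]


def group_index(length, bounds=LENGTH_BOUNDS):
    """Return the index of the length group that a protein falls into."""
    return bisect_right(bounds, length)


def split_by_length(proteins, bounds=LENGTH_BOUNDS):
    """Partition (name, sequence) pairs into len(bounds) + 1 groups of names."""
    groups = [[] for _ in range(len(bounds) + 1)]
    for name, seq in proteins:
        groups[group_index(len(seq), bounds)].append(name)
    return groups


def haar_step(signal):
    """One level of the orthonormal Haar transform (assumed wavelet).

    Returns (approximation, detail). An odd-length signal is padded by
    repeating its last value. The detail coefficients respond to changes
    between neighbouring positions, e.g. along one PSSM column.
    """
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail
```

At prediction time, `group_index(len(sequence))` would select which of the per-group models to apply, so a new protein is only ever scored by a model trained on proteins of comparable length.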
