Protein secondary structure prediction based on two dimensional deep convolutional neural networks

The highest three-state prediction accuracy for protein secondary structure is now 82–84% without using structure templates, approaching the theoretical limit of 88–90%. Increasingly large training datasets cover more protein sequences and structures. More powerful deep learning techniques can not only handle the computational load of large datasets but also capture long-range interactions within a protein sequence. In this research, we propose a new approach for designing a two-dimensional deep convolutional neural network (2DCNN) with 6 convolutional layers and 5 max-pooling layers. The two-dimensional convolutional neural network preserves the original amino acid sequence position information through a two-dimensional input matrix and better extracts features of sequence interactions. Our 2DCNN prediction model achieves 83.09%, 81.74%, 82.41%, 83.56%, 81.16%, and 80.30% on the 25PDB, CB513, CASP9, CASP10, CASP11, and CASP12 datasets, respectively. Our prediction model achieves better results than most state-of-the-art methods. (http://qilubio.qlu.edu.cn/protein)
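For readers unfamiliar with the three-state accuracy quoted above, the following minimal sketch shows how such a per-residue score is typically computed. It is an illustration only, not the authors' evaluation code: the function names `reduce_to_three_states` and `q3_accuracy` are hypothetical, and the eight-to-three state reduction (H, G, I to helix; E, B to strand; everything else to coil) is one common convention that the text does not confirm was used here.

```python
# Assumed 8-to-3 state reduction of DSSP labels (one common convention;
# the source does not state which reduction it uses).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_to_three_states(dssp: str) -> str:
    """Map each DSSP label to helix (H), strand (E), or coil (C)."""
    # Any label not listed in DSSP_TO_3 is treated as coil.
    return "".join(DSSP_TO_3.get(state, "C") for state in dssp)


def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose three-state label matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    observed = reduce_to_three_states("HGIEBTS-")
    predicted = "HHHEECCH"  # hypothetical model output for 8 residues
    print(observed, q3_accuracy(predicted, observed))
```

Under this convention, a reported figure such as 83.09% would mean that roughly 83 of every 100 residues in the test set receive the correct helix, strand, or coil label.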
