Feature selection for protein dihedral angle prediction

Three-dimensional structure prediction is of crucial importance in bioinformatics and theoretical chemistry. One of its main steps is the prediction of dihedral (torsion) angles. As new feature extraction methods are developed, the dimension of the input space increases considerably, which lengthens model training and reduces accuracy because of noisy or redundant features. In this study, feature selection is employed for dimensionality reduction on one of the established benchmarks of protein 1D structure prediction. Experimental results show that feature selection improves the accuracy of protein dihedral angle class prediction by 2% and can eliminate up to 82% of the features when a random forest classifier is used. Accurate prediction of dihedral angles will eventually contribute to protein structure prediction.
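As an illustration of the general idea only, the sketch below ranks discrete features by information gain against a class label and keeps the top-scoring ones. The abstract does not name the selection method used in the study, so information gain is a stand-in chosen for this example, and the toy data, the three-class labels, and the function names (`entropy`, `information_gain`, `select_top_k`) are all hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after grouping samples by a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(rows, labels, k):
    """Return (sorted indices of the k highest-scoring features, all scores)."""
    n_features = len(rows[0])
    scores = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Toy data (assumption): feature 0 copies the class label, features 1-4 are noise.
rng = random.Random(0)
labels = [rng.randrange(3) for _ in range(300)]  # stand-in for angle classes
rows = [[y] + [rng.randrange(3) for _ in range(4)] for y in labels]

selected, scores = select_top_k(rows, labels, k=1)
reduced_rows = [[r[j] for j in selected] for r in rows]  # reduced input space
print(selected, [round(s, 3) for s in scores])
```

In this toy setting the informative feature is retained and the four noise features are dropped. A classifier such as a random forest would then be trained on `reduced_rows` instead of `rows`.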
