Predicting the Types of Ion Channel-Targeted Conotoxins Based on AVC-SVM Model

Conotoxins are small, disulfide-rich peptides. Predicting the types of ion channel-targeted conotoxins is valuable for the treatment of chronic diseases, epilepsy, and cardiovascular diseases. To address the information redundancy found in current methods, we present a new model for predicting the types of ion channel-targeted conotoxins based on AVC (Analysis of Variance and Correlation) and SVM (Support Vector Machine). First, the F value is used to measure how significantly each feature affects the result, and features with small F values are filtered out in a rough selection step. Second, the degree of redundancy between features is calculated with the Pearson correlation coefficient, and a threshold is set to filter out features with weak independence, which yields the refined feature set. Finally, an SVM is used to predict the types of ion channel-targeted conotoxins. The experimental results show that the proposed AVC-SVM model reaches an overall accuracy of 91.98% and an average accuracy of 92.17% with a total of 68 parameters. The proposed model provides useful information for further experimental research. The prediction model will be freely accessible on our web server.
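The two feature-selection stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`anova_f`, `pearson`, `avc_select`), the parameters `keep_top` and `max_abs_r`, their values, and the toy data are all assumptions made for illustration, since the abstract does not state the cut-offs that were actually used.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature across the classes."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def pearson(a, b):
    """Pearson correlation coefficient between two feature columns."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5


def avc_select(X, y, keep_top, max_abs_r):
    """Rough selection by F value, then refinement by pairwise correlation.

    keep_top and max_abs_r are hypothetical tuning parameters.
    """
    columns = list(zip(*X))
    f_scores = [anova_f(col, y) for col in columns]
    # Rough selection: keep the keep_top features with the largest F values.
    ranked = sorted(range(len(columns)), key=lambda j: f_scores[j], reverse=True)
    ranked = ranked[:keep_top]
    # Refinement: skip a feature if it is strongly correlated with one already kept.
    selected = []
    for j in ranked:
        if all(abs(pearson(columns[j], columns[s])) < max_abs_r for s in selected):
            selected.append(j)
    return selected


# Toy data (made up): feature 1 is feature 0 doubled, feature 2 is noise.
X = [
    [1.0, 2.0, 5.0, 0.0],
    [1.2, 2.4, 3.0, 0.1],
    [3.0, 6.0, 4.0, 2.0],
    [3.2, 6.4, 5.0, 2.1],
    [5.0, 10.0, 3.0, 0.0],
    [5.1, 10.2, 4.0, 0.1],
]
y = ["A", "A", "B", "B", "C", "C"]

selected = avc_select(X, y, keep_top=3, max_abs_r=0.9)
print(selected)
```

In this toy case the noise feature is removed by the F-value ranking, and only one of the two perfectly correlated features survives the correlation filter. The surviving columns would then be passed to an SVM classifier, for example `sklearn.svm.SVC` from scikit-learn, which is likewise an assumption about tooling rather than something the abstract specifies.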
