Artificial bee colony-based support vector machines with feature selection and parameter optimization for rule extraction

The support vector machine (SVM) is a state-of-the-art classification tool whose good accuracy stems from its ability to generate nonlinear models. However, these nonlinear models are typically regarded as incomprehensible black boxes. This lack of explanatory ability is a serious problem for practical SVM applications that require comprehensibility. Therefore, this study applies a C5 decision tree (DT) to extract rules from the SVM results. Both SVM and the C5 DT are computationally expensive, and applying the two together to high-dimensional data increases the cost further. For this reason, a metaheuristic, the artificial bee colony (ABC) optimization algorithm, is employed to select the important features. In the proposed ABC–SVM–DT algorithm, ABC performs feature selection and parameter optimization before SVM–DT extracts comprehensible rules from the SVM. The proposed algorithm is evaluated on eight datasets. The results show that, compared with the genetic algorithm and the particle swarm optimization algorithm, the proposed ABC–SVM–DT algorithm can simultaneously improve the classification accuracy and the complexity of the final decision tree.
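To make the search step concrete, the following is a minimal sketch of how an ABC loop might jointly search over a feature subset and two SVM parameters. It is not the authors' implementation. The encoding (one value in [0, 1] per feature plus two values scaled to C and gamma), the parameter ranges, the names `decode`, `abc_search`, and `toy_fitness`, and the toy fitness itself are all assumptions made for illustration. In a real pipeline the fitness would instead be something like the cross-validated accuracy of an SVM trained on the selected features, after which a decision tree would be fitted to the SVM's predicted labels to extract rules.

```python
import random

N_FEATURES = 6
RELEVANT = {0, 2}  # hypothetical "truly useful" features, used only by the toy fitness


def decode(position, n_features, c_range=(0.01, 100.0), g_range=(0.0001, 10.0)):
    """Map a point in [0, 1]^(n_features + 2) to (feature mask, C, gamma).

    The ranges for C and gamma are assumed values, not taken from the paper.
    """
    mask = [x > 0.5 for x in position[:n_features]]
    c = c_range[0] + position[n_features] * (c_range[1] - c_range[0])
    gamma = g_range[0] + position[n_features + 1] * (g_range[1] - g_range[0])
    return mask, c, gamma


def toy_fitness(position):
    """Stand-in for SVM validation accuracy (an assumption, not real training).

    Rewards selecting the hypothetical relevant features and penalizes extras.
    """
    mask, _c, _gamma = decode(position, N_FEATURES)
    hits = sum(1 for i in RELEVANT if mask[i])
    extras = sum(mask) - hits
    return hits / len(RELEVANT) - 0.05 * extras


def abc_search(fitness, dim, n_sources=10, limit=5, iterations=50, seed=0):
    """Simplified artificial bee colony search that maximizes `fitness`."""
    rng = random.Random(seed)
    sources = [[rng.random() for _ in range(dim)] for _ in range(n_sources)]
    scores = [fitness(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed improvements per food source
    best_score = max(scores)
    best = list(sources[scores.index(best_score)])

    def try_neighbor(i):
        """Perturb one dimension of source i relative to a random other source."""
        nonlocal best, best_score
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        phi = rng.uniform(-1.0, 1.0)
        candidate = list(sources[i])
        moved = candidate[d] + phi * (candidate[d] - sources[k][d])
        candidate[d] = min(1.0, max(0.0, moved))  # keep inside [0, 1]
        score = fitness(candidate)
        if score > scores[i]:  # greedy selection
            sources[i], scores[i], trials[i] = candidate, score, 0
            if score > best_score:
                best, best_score = list(candidate), score
        else:
            trials[i] += 1

    for _ in range(iterations):
        # Employed bees: one local move per food source.
        for i in range(n_sources):
            try_neighbor(i)
        # Onlooker bees: pick sources with probability proportional to quality.
        low = min(scores)
        weights = [s - low + 1e-9 for s in scores]
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=weights)[0]
            try_neighbor(i)
        # Scout bees: abandon sources that have stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = [rng.random() for _ in range(dim)]
                scores[i] = fitness(sources[i])
                trials[i] = 0
                if scores[i] > best_score:
                    best, best_score = list(sources[i]), scores[i]
    return best, best_score


best, best_score = abc_search(toy_fitness, dim=N_FEATURES + 2)
mask, c, gamma = decode(best, N_FEATURES)
print("selected features:", [i for i, keep in enumerate(mask) if keep])
print("C =", c, "gamma =", gamma, "fitness =", best_score)
```

Treating the feature mask and the kernel parameters as one solution vector lets a single search trade off subset size against parameter choice; whether the paper uses exactly this encoding is not stated in the abstract.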
