Rule extraction from support vector machines: a hybrid approach for solving classification and regression problems

This paper presents a novel hybrid approach for extracting rules from support vector machines (SVM) and support vector regression (SVR). The hybrid has three phases: 1) the SVM-recursive feature elimination (SVM-RFE) algorithm is employed for feature selection; 2) SVM/SVR models are built on the selected features, and the actual target values of the training instances are replaced by the predictions of these models, yielding a modified training set; 3) rules are generated from the modified training set using a decision tree (DT), classification and regression tree (CART), adaptive network based fuzzy inference system (ANFIS) and dynamic evolving neural-fuzzy inference system (DENFIS). Extensive experiments are conducted on three benchmark classification problems, four bank bankruptcy prediction problems and five benchmark regression problems. We conclude that the rules obtained after feature selection perform comparably to those extracted using all features. Comprehensibility also improves after feature selection.
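The three phases can be illustrated with a minimal sketch for the classification case. It is not the authors' implementation: it assumes scikit-learn, the Iris data set, a linear-kernel SVM inside RFE, an RBF-kernel SVM as the model to be explained, and a depth-limited CART-style tree as the rule generator. The function name `extract_rules` and every parameter value are hypothetical choices made for illustration. ANFIS and DENFIS are not covered, since scikit-learn does not provide them.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text


def extract_rules(X, y, feature_names, n_features=2, seed=0):
    """Hypothetical helper: SVM-RFE, then SVM, then a tree fit to SVM outputs."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Phase 1: SVM-RFE feature selection (a linear kernel is assumed here
    # because RFE needs per-feature weights to rank features).
    selector = RFE(SVC(kernel="linear"), n_features_to_select=n_features)
    selector.fit(X_train, y_train)
    kept = [name for name, keep in zip(feature_names, selector.support_) if keep]
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    # Phase 2: train the SVM on the selected features and replace the
    # actual training targets with the SVM's own predictions.
    svm = SVC(kernel="rbf", gamma="scale").fit(X_train_sel, y_train)
    y_train_svm = svm.predict(X_train_sel)

    # Phase 3: generate rules from the modified training set with a
    # CART-style decision tree (depth limit assumed, to keep rules short).
    tree = DecisionTreeClassifier(max_depth=3, random_state=seed)
    tree.fit(X_train_sel, y_train_svm)
    rules = export_text(tree, feature_names=kept)

    # Fidelity: how often the rules agree with the SVM on held-out data.
    # Accuracy: how often the rules agree with the true labels.
    tree_pred = tree.predict(X_test_sel)
    fidelity = float((tree_pred == svm.predict(X_test_sel)).mean())
    accuracy = float((tree_pred == y_test).mean())
    return rules, kept, fidelity, accuracy


if __name__ == "__main__":
    data = load_iris()
    rules, kept, fidelity, accuracy = extract_rules(
        data.data, data.target, list(data.feature_names)
    )
    print("Selected features:", kept)
    print(rules)
    print(f"fidelity={fidelity:.3f} accuracy={accuracy:.3f}")
```

For the regression case, the same structure would apply with `SVR` and `DecisionTreeRegressor` substituted, and with an error measure in place of fidelity and accuracy. Because the tree is trained on the SVM's predictions rather than the original labels, its rules describe what the SVM has learned, which is why fidelity is reported alongside accuracy.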
