Rule Extraction from Support Vector Machines: A Sequential Covering Approach

In this paper, we propose a novel algorithm for rule extraction from support vector machines (SVMs), termed SQRex-SVM. The proposed method extracts rules directly from the support vectors (SVs) of a trained SVM using a modified sequential covering algorithm. Rules are generated by an ordered search over the most discriminative features, as measured by interclass separation. Rule performance is then evaluated using the measured true and false positive rates and the area under the receiver operating characteristic (ROC) curve (AUC). Results on a number of commonly used data sets show that the rules produced by SQRex-SVM exhibit both improved generalization performance and smaller, more comprehensible rule sets compared with other SVM rule extraction techniques and with direct rule learning techniques.
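The general idea of sequential covering guided by a feature ranking can be illustrated with a minimal sketch. This is not the SQRex-SVM algorithm itself, whose details the abstract does not give. The separation measure (gap between class means divided by the summed standard deviations), the single-threshold conditions, the TP minus FP scoring, and all function names are assumptions made for illustration. The toy points stand in for support vectors that would normally come from a trained SVM. The AUC is computed for the single operating point of a discrete rule set, which reduces to (1 + TPR - FPR) / 2.

```python
from statistics import mean, pstdev

def separation(X, y, f):
    """Assumed interclass separation: gap between class means over summed spread."""
    pos = [x[f] for x, label in zip(X, y) if label == 1]
    neg = [x[f] for x, label in zip(X, y) if label == 0]
    return abs(mean(pos) - mean(neg)) / (pstdev(pos) + pstdev(neg) + 1e-9)

def covers(rule, x):
    """A rule is a conjunction of (feature, op, threshold) conditions."""
    return all(x[f] >= t if op == ">=" else x[f] <= t for f, op, t in rule)

def grow_rule(X, y, idx, ranked):
    """Add at most one condition per feature, in ranked order, until only positives remain."""
    rule, covered = [], list(idx)
    for f in ranked:
        if all(y[i] == 1 for i in covered):
            break
        base = sum(1 if y[i] == 1 else -1 for i in covered)  # TP - FP before refining
        best = None
        values = sorted({X[i][f] for i in covered})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold midway between neighbouring values
            for op in (">=", "<="):
                keep = [i for i in covered if covers([(f, op, t)], X[i])]
                score = sum(1 if y[i] == 1 else -1 for i in keep)
                if best is None or score > best[0]:
                    best = (score, (f, op, t), keep)
        if best is not None and best[0] > base:  # keep only conditions that help
            rule.append(best[1])
            covered = best[2]
    return rule

def sequential_covering(X, y, max_rules=10):
    """Learn rules for class 1, removing the positives each new rule covers."""
    ranked = sorted(range(len(X[0])), key=lambda f: separation(X, y, f), reverse=True)
    negatives = [i for i in range(len(X)) if y[i] == 0]
    positives = [i for i in range(len(X)) if y[i] == 1]
    rules = []
    while positives and len(rules) < max_rules:
        rule = grow_rule(X, y, positives + negatives, ranked)
        newly_covered = [i for i in positives if covers(rule, X[i])]
        if not rule or not newly_covered:
            break  # no useful rule could be grown
        rules.append(rule)
        positives = [i for i in positives if i not in newly_covered]
    return rules

def evaluate(rules, X, y):
    """Return (TPR, FPR, AUC) for the rule set treated as one discrete classifier."""
    pred = [any(covers(r, x) for r in rules) for x in X]
    tp = sum(1 for p, label in zip(pred, y) if p and label == 1)
    fp = sum(1 for p, label in zip(pred, y) if p and label == 0)
    tpr = tp / sum(1 for label in y if label == 1)
    fpr = fp / sum(1 for label in y if label == 0)
    return tpr, fpr, (1 + tpr - fpr) / 2

# Toy stand-ins for support vectors (hypothetical data, two features).
X_sv = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [6.0, 2.0], [7.0, 5.0], [8.0, 1.0]]
y_sv = [0, 0, 0, 1, 1, 1]

rules = sequential_covering(X_sv, y_sv)
tpr, fpr, auc = evaluate(rules, X_sv, y_sv)
print(rules, tpr, fpr, auc)
```

In this sketch each rule is grown against all negative points, and only the positives it covers are removed before the next rule is grown, so the result can be read as an unordered rule set for the positive class.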
