Quadratic programming for class ordering in rule induction

Separate-and-conquer rule induction algorithms such as Ripper solve a K > 2 class problem by converting it into a sequence of K − 1 two-class problems. As the usual heuristic, the classes are fed into the algorithm in order of increasing prior probability. Although this heuristic works well in practice, it leaves considerable room for improvement. In this paper, we propose a novel approach that improves on it: we transform the ordering search problem into a quadratic optimization problem and use the solution of that problem to extract the optimal ordering. We compared the new Ripper (guided by the ordering found with our approach) with the original Ripper (guided by the heuristic ordering) on 27 datasets. Simulation results show that our approach produces rulesets that are significantly better than those produced by the original Ripper.
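To make the baseline heuristic concrete, the following minimal Python sketch orders classes by increasing prior probability and lists the resulting K − 1 two-class problems. It illustrates only the heuristic ordering, not the quadratic programming approach proposed here, and it is not Ripper's implementation. The function names `heuristic_class_order` and `two_class_problems` are hypothetical, ties are broken by label name as an arbitrary choice, and the sketch assumes that every instance of the current positive class is removed before the next problem, whereas a real learner removes only the instances its rules cover.

```python
from collections import Counter


def heuristic_class_order(labels):
    """Return the classes sorted by increasing frequency (prior probability).

    Ties are broken by the string form of the label; this tie-break is an
    assumption made for determinism, not something the paper specifies.
    """
    counts = Counter(labels)
    return sorted(counts, key=lambda c: (counts[c], str(c)))


def two_class_problems(labels):
    """Build the K - 1 two-class problems implied by the heuristic ordering.

    Each problem is a pair (positive_class, targets), where targets marks
    which of the remaining instances belong to the positive class. The last
    class in the ordering gets no problem of its own; it acts as the default.
    """
    order = heuristic_class_order(labels)
    remaining = list(labels)
    problems = []
    for positive in order[:-1]:
        targets = [y == positive for y in remaining]
        problems.append((positive, targets))
        # Simplifying assumption: the learned rules cover the positive class
        # exactly, so all of its instances are removed before the next step.
        remaining = [y for y in remaining if y != positive]
    return order, problems


if __name__ == "__main__":
    toy_labels = ["a"] * 5 + ["b"] * 2 + ["c"] * 3
    order, problems = two_class_problems(toy_labels)
    print("class order:", order)
    for positive, targets in problems:
        print(positive, "vs rest:", sum(targets), "positives of", len(targets))
```

Any other permutation of the classes could be passed through the same loop in place of the heuristic order, which is the degree of freedom the ordering search exploits.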
