Classification Rule Mining with Iterated Greedy

In data mining, classification rule discovery is the task of designing accurate rule-based systems that model the useful knowledge, present in large data sets, that differentiates some data classes from others. Iterated greedy search is a powerful metaheuristic that has been applied successfully to a range of optimisation problems but, to our knowledge, has not previously been used for classification rule mining. In this work, we analyse the suitability of iterated greedy algorithms for the design of rule-based classification systems. We present and study different alternatives and compare their results with state-of-the-art methodologies from the literature. The results show that iterated greedy search can generate accurate rule-based classification systems with acceptable interpretability levels.
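
To make the idea concrete, the following is a minimal sketch of the generic iterated greedy scheme (partial destruction of a solution, greedy reconstruction, then an acceptance test) applied to an ordered list of classification rules. It is not the method studied in this work: the rule representation (a single `attribute == value` test per rule), the training-accuracy objective, the "accept if not worse" criterion, and every name used (`iterated_greedy`, `greedy_construct`, `TOY_DATA`, and so on) are assumptions made purely for illustration.

```python
import random
from collections import Counter

# Hypothetical toy data: (attribute tuple, class label) pairs.
TOY_DATA = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "hot"), "yes"),
    (("overcast", "hot"), "yes"),
    (("sunny", "cool"), "yes"),
]

def predict(rules, default, row):
    """Return the label of the first rule that matches, else the default class."""
    for attr, value, label in rules:
        if row[attr] == value:
            return label
    return default

def accuracy(rules, default, data):
    """Fraction of examples in `data` classified correctly by the rule list."""
    hits = sum(predict(rules, default, x) == y for x, y in data)
    return hits / len(data)

def candidate_rules(data):
    """Enumerate every (attribute index, value, label) rule seen in the data."""
    labels = {y for _, y in data}
    return sorted({(a, v, c) for x, _ in data for a, v in enumerate(x) for c in labels})

def greedy_construct(rules, default, data, candidates, max_rules):
    """Append the rule that most improves accuracy until no rule helps."""
    rules = list(rules)
    while len(rules) < max_rules:
        best = max(
            (c for c in candidates if c not in rules),
            key=lambda c: accuracy(rules + [c], default, data),
            default=None,
        )
        if best is None:
            break
        if accuracy(rules + [best], default, data) <= accuracy(rules, default, data):
            break
        rules.append(best)
    return rules

def iterated_greedy(data, iterations=50, destroy=1, max_rules=4, seed=0):
    """Repeat destruction and greedy reconstruction, keeping non-worsening results."""
    rng = random.Random(seed)
    default = Counter(y for _, y in data).most_common(1)[0][0]
    candidates = candidate_rules(data)
    current = greedy_construct([], default, data, candidates, max_rules)
    for _ in range(iterations):
        # Destruction: drop a few randomly chosen rules from the current list.
        partial = list(current)
        for _ in range(min(destroy, len(partial))):
            partial.pop(rng.randrange(len(partial)))
        # Construction: greedily rebuild the rule list from the partial one.
        trial = greedy_construct(partial, default, data, candidates, max_rules)
        # Acceptance: keep the rebuilt list if it is at least as accurate.
        if accuracy(trial, default, data) >= accuracy(current, default, data):
            current = trial
    return current, default
```

Calling `iterated_greedy(TOY_DATA)` returns a rule list together with the default class. The `max_rules` cap is one simple way to trade accuracy against interpretability in this sketch; a more realistic objective would evaluate rules on held-out data rather than on the training set.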
