A heuristic for learning decision trees and pruning them into classification rules

Consider a set of training examples described by continuous or symbolic attributes and labeled with categorical classes. In this paper we present a measure of how well a region of the attribute space would serve as a rule condition for classifying unseen cases. The measure takes into account the class distribution of the examples. We call it the impurity level; it is inspired by a similar measure used in the instance-based algorithm IB3 to select paradigmatic exemplars that will classify future cases in a nearest-neighbor context. We illustrate the features of the impurity level with a version of Quinlan's well-known C4.5 in which the information-based heuristics are replaced by our measure. The experiments carried out to test these proposals indicate that very high accuracy is reached with sets of classification rules as small as those found by RIPPER.
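The abstract does not define the impurity level itself, so it is not reproduced here. As background only, the following is a minimal sketch of the kind of acceptance test IB3 applies to a stored exemplar, as we understand it from Aha et al.: a confidence interval on the exemplar's classification accuracy is compared with a confidence interval on the observed frequency of its class. The function names `wilson_interval` and `ib3_style_acceptable`, and the default `z=0.9`, are illustrative assumptions and are not taken from this paper.

```python
import math


def wilson_interval(successes, trials, z):
    """Return the (low, high) Wilson score interval for a binomial proportion.

    Hypothetical helper: with no trials there is no evidence, so the
    interval is the whole range [0, 1].
    """
    if trials == 0:
        return 0.0, 1.0
    p = successes / trials
    z2 = z * z
    centre = p + z2 / (2 * trials)
    spread = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials))
    denom = 1 + z2 / trials
    return (centre - spread) / denom, (centre + spread) / denom


def ib3_style_acceptable(correct, attempts, class_count, total, z=0.9):
    """IB3-style acceptance test (a sketch, not the paper's impurity level).

    correct / attempts   : classification record of one exemplar
    class_count / total  : observed frequency of the exemplar's class
    The exemplar is accepted when the lower end of its accuracy interval
    lies above the upper end of its class-frequency interval.
    """
    accuracy_low, _ = wilson_interval(correct, attempts, z)
    _, frequency_high = wilson_interval(class_count, total, z)
    return accuracy_low > frequency_high


if __name__ == "__main__":
    # An exemplar that was right 9 times out of 10, for a class covering
    # half of 100 training examples.
    print(ib3_style_acceptable(9, 10, 50, 100))
    # An exemplar with too little evidence: right 1 time out of 2.
    print(ib3_style_acceptable(1, 2, 50, 100))
```

The point of comparing intervals rather than raw ratios is that an exemplar with few observations gets a wide interval and is not trusted merely because its raw accuracy happens to be high.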
