An evolution strategies approach to the simultaneous discretization of numeric attributes in data mining

Many data mining and machine learning algorithms require databases in which objects are described by discrete attributes. In practice, however, attributes are very often measured on ratio or interval scales. To apply these algorithms, the original attributes must first be transformed to a nominal or ordinal scale via discretization. Choosing an appropriate transformation is crucial, because it strongly influences the results obtained from data mining procedures. This paper presents a hybrid technique for the simultaneous supervised discretization of continuous attributes. It is based on evolutionary algorithms, in particular evolution strategies (ES), combined with rough set theory and information theory. The purpose is to construct a discretization scheme for all continuous attributes simultaneously (i.e., globally), so that the discrete intervals generated for the predictor variables maximize class predictability. The ES approach is applied to 17 public data sets, and the results are compared with those of classical discretization methods. ES-based discretization not only outperforms these methods but also leads to much simpler data models and is able to discover irrelevant attributes, features that classical discretization techniques do not offer.
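The abstract does not give the fitness function, the encoding, or the ES variant, so the following is only a minimal sketch of the general idea under stated assumptions. It encodes one candidate as a list of cut points per attribute, scores it by the conditional entropy of the class given the discretized cells plus a small penalty per interval boundary actually used, and improves it with a simple elitist (mu + lambda) loop using a fixed mutation step. The names `discretize`, `conditional_entropy`, `fitness`, and `evolve`, the penalty weight, and all parameter defaults are hypothetical; the paper's rough-set component and step-size self-adaptation are not reproduced here.

```python
import bisect
import math
import random
from collections import Counter, defaultdict


def discretize(rows, cuts):
    """Map each numeric row to a tuple of interval indices, one per attribute."""
    return [tuple(bisect.bisect_right(c, v) for v, c in zip(row, cuts))
            for row in rows]


def conditional_entropy(cells, labels):
    """H(class | cell) in bits; 0 means the cells fully determine the class."""
    groups = defaultdict(Counter)
    for cell, label in zip(cells, labels):
        groups[cell][label] += 1
    n = len(labels)
    h = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for k in counts.values():
            h -= (size / n) * (k / size) * math.log2(k / size)
    return h


def fitness(rows, labels, cuts, penalty=0.01):
    """Lower is better. Assumed objective: class uncertainty plus a small
    cost for every interval boundary that actually separates some rows."""
    cells = discretize(rows, cuts)
    used = sum(len({cell[j] for cell in cells}) - 1 for j in range(len(cuts)))
    return conditional_entropy(cells, labels) + penalty * used


def evolve(rows, labels, max_cuts=2, mu=5, lam=20, generations=60, seed=0):
    """Elitist (mu + lambda) search over cut points for all attributes at once.
    Uses a fixed mutation step (10% of each attribute's range) for brevity."""
    rng = random.Random(seed)
    n_attr = len(rows[0])
    lo = [min(r[j] for r in rows) for j in range(n_attr)]
    hi = [max(r[j] for r in rows) for j in range(n_attr)]

    def random_individual():
        return [sorted(rng.uniform(lo[j], hi[j]) for _ in range(max_cuts))
                for j in range(n_attr)]

    def mutate(ind):
        return [sorted(c + rng.gauss(0, 0.1 * (hi[j] - lo[j] or 1.0))
                       for c in cuts)
                for j, cuts in enumerate(ind)]

    pop = [random_individual() for _ in range(mu)]
    for _ in range(generations):
        offspring = [mutate(rng.choice(pop)) for _ in range(lam)]
        # Parents compete with offspring, so the best fitness never worsens.
        pop = sorted(pop + offspring,
                     key=lambda ind: fitness(rows, labels, ind))[:mu]
    return pop[0]


if __name__ == "__main__":
    # Toy data: attribute 0 separates the classes, attribute 1 is constant.
    rows = [(1.0, 5.0), (2.0, 5.0), (8.0, 5.0), (9.0, 5.0)]
    labels = ["a", "a", "b", "b"]
    best = evolve(rows, labels)
    print(best, fitness(rows, labels, best))
```

Because the penalty counts only boundaries that split the data, an attribute whose cuts never separate any rows contributes nothing to the score. In this sketch, that is how an irrelevant attribute would show up: all of its values fall into a single interval.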
