Multi-interval Discretization Methods for Decision Tree Learning

Properly discretizing continuous-valued features is an important problem in decision tree learning. This paper describes four multi-interval discretization methods for decision tree induction, applied dynamically during tree construction. We compare two known discretization methods with two new methods proposed in this paper: a histogram-based method and a neural-network-based method using learning vector quantization (LVQ). The methods are compared by the accuracy and the compactness of the resulting decision tree. The comparison uses three data sets: the IRIS domain, the satellite domain, and the OHS (ovarian hyperstimulation) domain.
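
To make the term concrete, the following minimal sketch shows what a multi-interval discretization of one continuous feature looks like and how the purity of the resulting intervals can be scored. It is a generic illustration, not one of the four methods compared in the paper: the abstract does not specify how the cut points are chosen, so they are simply passed in here. The function names `discretize` and `partition_entropy`, the entropy-based score, and the toy data are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import Counter
from math import log2


def discretize(values, cut_points):
    """Map each continuous value to the index of its interval.

    k sorted cut points define k + 1 intervals; a value equal to a
    cut point falls into the interval on its right.
    """
    cuts = sorted(cut_points)
    return [bisect_right(cuts, v) for v in values]


def partition_entropy(intervals, labels):
    """Weighted class entropy (in bits) of the partition induced by the intervals."""
    by_interval = {}
    for interval, label in zip(intervals, labels):
        by_interval.setdefault(interval, Counter())[label] += 1
    total = len(labels)
    entropy = 0.0
    for counts in by_interval.values():
        size = sum(counts.values())
        for n in counts.values():
            p = n / size
            entropy -= (size / total) * p * log2(p)
    return entropy


# Toy data, invented for illustration (not taken from the paper).
values = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0, 7.0, 7.5, 8.0]
labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

# Two cut points give three intervals; one cut point gives two.
three_intervals = discretize(values, cut_points=[3.0, 6.0])
two_intervals = discretize(values, cut_points=[3.0])

print(three_intervals, partition_entropy(three_intervals, labels))
print(two_intervals, partition_entropy(two_intervals, labels))
```

On this toy data, the three-interval split separates the classes completely, while the single binary cut leaves two classes mixed in one interval. That is the kind of situation in which a multi-interval split can yield a more compact tree than repeated binary splits on the same feature.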
