Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning

Since most real-world applications of classification learning involve continuous-valued attributes, discretizing those attributes well is an important problem. This paper addresses the use of the entropy minimization heuristic for discretizing the range of a continuous-valued attribute into multiple intervals.
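The core of the heuristic can be sketched as follows: pick the cut point that minimizes the class-weighted entropy of the two resulting partitions, then apply the same procedure recursively to each side to obtain multiple intervals. This is a minimal illustration, not the paper's exact algorithm; in particular, the paper stops recursion with an MDL-based criterion, whereas the fixed `min_gain` threshold below is a simplifying assumption.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Return (cut, weighted_entropy) for the cut point minimizing the
    class-weighted entropy of the two partitions, or (None, H) if no
    cut improves on the unsplit entropy. Inputs must be sorted by value."""
    n = len(values)
    best = (None, entropy(labels))
    for i in range(1, n):
        if values[i] == values[i - 1]:
            continue  # cut points lie only between distinct values
        h = (i / n) * entropy(labels[:i]) + ((n - i) / n) * entropy(labels[i:])
        if h < best[1]:
            best = ((values[i - 1] + values[i]) / 2, h)
    return best

def discretize(values, labels, min_gain=0.1):
    """Recursively split the attribute range into multiple intervals.
    NOTE: min_gain is an assumed stopping rule standing in for the
    MDL-based criterion used in the paper."""
    pairs = sorted(zip(values, labels))
    vs = [v for v, _ in pairs]
    ls = [l for _, l in pairs]
    cut, h = best_cut(vs, ls)
    if cut is None or entropy(ls) - h < min_gain:
        return []
    i = next(k for k, v in enumerate(vs) if v > cut)
    return (discretize(vs[:i], ls[:i], min_gain)
            + [cut]
            + discretize(vs[i:], ls[i:], min_gain))

vals = [1, 2, 3, 10, 11, 12, 20, 21, 22]
labs = ['a', 'a', 'a', 'b', 'b', 'b', 'a', 'a', 'a']
print(discretize(vals, labs))  # cut points between the three class runs
```

On this toy data the recursion yields two cut points, splitting the range into three intervals that align with the runs of class labels; a single binary cut could not separate all three.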
