Paper information - Discretization of Continuous Attributes for Learning Classification Rules

Discretization of Continuous Attributes for Learning Classification Rules

We present a comparison of three entropy-based discretization methods in the context of learning classification rules. We compare binary recursive discretization with a stopping criterion based on the Minimum Description Length Principle (MDLP) [3], a non-recursive method that simply chooses a number of cut-points with the highest entropy gains, and a non-recursive method that selects cut-points according to both information entropy and the distribution of potential cut-points over the instance space. Our empirical results show that the third method gives the best predictive performance among the three methods tested.
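As an illustration of the second method described above (choosing a number of cut-points with the highest entropy gains), the following is a minimal sketch and not the authors' implementation. It assumes candidate cut-points are midpoints between adjacent distinct attribute values and that the number of cut-points `k` is supplied by the caller; the function names `entropy`, `cut_point_gains`, and `top_k_cut_points` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def cut_point_gains(values, labels):
    """Return (cut_point, entropy_gain) for each candidate cut-point.

    Assumption: candidates are midpoints between adjacent distinct values.
    """
    pairs = sorted(zip(values, labels))
    sorted_labels = [y for _, y in pairs]
    base = entropy(sorted_labels)
    n = len(pairs)
    gains = []
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between identical values
        left, right = sorted_labels[:i], sorted_labels[i:]
        # Weighted entropy of the two partitions induced by the cut.
        remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        gains.append(((lo + hi) / 2, base - remainder))
    return gains


def top_k_cut_points(values, labels, k):
    """Pick the k candidates with the highest entropy gain, in ascending order."""
    ranked = sorted(cut_point_gains(values, labels), key=lambda g: g[1], reverse=True)
    return sorted(cut for cut, _ in ranked[:k])
```

For example, `top_k_cut_points([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"], 1)` selects the single cut-point that separates the two classes. Ranking purely by gain can cluster the chosen cut-points in one region of the attribute range, which is the situation the third method in the abstract addresses by also considering how potential cut-points are distributed.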

Nick Cercone | Aijun An

[1] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.

[2] Nick Cercone, et al. ELEM2: A Learning System for More Accurate Classifications, 1998, Canadian Conference on AI.

[3] J. Ross Quinlan, et al. C4.5: Programs for Machine Learning, 1992.

[4] Ron Kohavi, et al. Supervised and Unsupervised Discretization of Continuous Features, 1995, ICML.

[5] Keki B. Irani, et al. Multi-interval discretization of continuous attributes as pre-processing for classification learning, 1993, IJCAI.