A Bayesian Discretizer for Real-Valued Attributes

Discretization of real-valued attributes into nominal intervals has been an important area for symbolic induction systems, because many real-world classification tasks involve both symbolic and numerical attributes. Among the various supervised and unsupervised discretization methods, those based on information gain have been widely used and cited. This paper designs a new discretization method, called the Bayesian discretizer, and compares its performance with the information gain methods implemented in C4.5 and HCV (Version 2.0). Of the seven datasets tested, the Bayesian discretizer achieves the best predictive accuracy on four.

Xindong Wu
