Three discretization methods for rule induction

We discuss problems associated with inducing decision rules from data with numerical attributes, which are common in real-life data. Rule induction from numerical data requires an additional step, called discretization, in which numerical values are converted into intervals. Most existing discretization methods are applied before rule induction, as part of data preprocessing; some discretize numerical attributes while learning decision rules. We compare the classification accuracy of a discretization method based on conditional entropy, applied before rule induction, with that of two newly proposed methods that are incorporated directly into the rule induction algorithm LEM2, so that discretization and rule induction are performed at the same time. In all three approaches the same system is used to classify new, unseen data. We conclude that the error rates of the three methods do not differ significantly; however, the rules induced by the two new methods are simpler and stronger. © 2001 John Wiley & Sons, Inc.
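To make the idea of entropy-based discretization concrete, the following Python sketch picks a single cut point for one numerical attribute by minimizing the conditional entropy of the decision given the resulting two intervals. It is only an illustration under stated assumptions, not the procedure used in the paper: the function names (`entropy`, `conditional_entropy`, `best_cut_point`), the use of midpoints between consecutive distinct values as candidate cuts, and the restriction to one binary split are all hypothetical choices made here for brevity.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of decision values."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(values, labels, cut):
    """Entropy of the decision given the two intervals induced by `cut`.

    Cases with value < cut fall into the left interval, all others into
    the right one; each interval's entropy is weighted by its share of cases.
    """
    left = [d for v, d in zip(values, labels) if v < cut]
    right = [d for v, d in zip(values, labels) if v >= cut]
    n = len(labels)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)


def best_cut_point(values, labels):
    """Return the candidate cut with the smallest conditional entropy.

    Candidates are midpoints between consecutive distinct attribute values
    (an assumption of this sketch). Returns None if the attribute has
    fewer than two distinct values.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        return None
    return min(candidates, key=lambda c: conditional_entropy(values, labels, c))


# Usage: four cases, one numerical attribute, two decision classes.
values = [1.0, 2.0, 3.0, 4.0]
labels = ["a", "a", "b", "b"]
print(best_cut_point(values, labels))  # → 2.5
```

In this toy data the cut at 2.5 separates the two decision classes perfectly, so the conditional entropy drops to zero and the attribute is replaced by the two intervals below and above 2.5. A preprocessing-style discretizer would typically repeat such splitting until some stopping condition holds, whereas the methods merged into rule induction choose intervals while the rules are being built.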
