Classification Learning Using All Rules

The covering algorithm has been ubiquitous in the induction of classification rules. This approach to machine learning uses heuristic search to find a small set of rules that adequately explains the training data. However, recent research has provided evidence that learning redundant classifiers can increase predictive accuracy. Learning all possible classifiers is arguably the ultimate form of this notion of redundant classification. This paper presents an algorithm that, in effect, learns all classifiers. A preliminary investigation (Webb, 1996b) suggested that a heuristic covering algorithm in general learns classification rules with higher predictive accuracy than this new approach. In this paper we present an extensive empirical comparison between the learning-all-rules algorithm and three varied established approaches to inductive learning: a covering algorithm, an instance-based learner, and a decision-tree learner. The evaluation provides strong evidence in support of learning-all-rules as a plausible approach to inductive learning.
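To make the two paradigms concrete, the sketch below contrasts a greedy covering learner with a learn-all-rules learner on a toy attribute-value dataset. It is a minimal illustration only: the dataset, the conjunctive rule representation, the coverage-counting heuristic, and majority voting at classification time are all assumptions made for exposition, not Webb's actual algorithm (which builds on the OPUS search framework; Webb, 1995).

```python
# Illustrative sketch only: a toy covering learner versus exhaustive rule
# enumeration with majority voting. The data, rule representation, and the
# greedy heuristic are assumptions for exposition, not Webb's OPUS-based method.
from itertools import combinations

# Toy training set: (attribute-value dict, class label) pairs.
DATA = [
    ({"outlook": "sunny", "windy": "no"},  "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rain",  "windy": "no"},  "play"),
    ({"outlook": "rain",  "windy": "yes"}, "stay"),
]

def matches(rule, example):
    """A rule is a conjunction of attribute tests; it fires if all tests hold."""
    return all(example.get(attr) == val for attr, val in rule.items())

def all_consistent_rules(data):
    """Enumerate every conjunctive rule that is pure on the training data."""
    conditions = sorted({(a, v) for x, _ in data for a, v in x.items()})
    rules = []
    for k in range(1, len(conditions) + 1):
        for conds in combinations(conditions, k):
            rule = dict(conds)
            if len(rule) < k:        # two tests on one attribute: contradictory
                continue
            labels = {c for x, c in data if matches(rule, x)}
            if len(labels) == 1:     # covers something, and only one class
                rules.append((rule, labels.pop()))
    return rules

def covering(data):
    """Sequential covering: greedily keep few rules until the data is explained."""
    remaining, theory = list(data), []
    while remaining:
        # Heuristic: the pure rule covering the most remaining examples.
        rule, label = max(all_consistent_rules(remaining),
                          key=lambda rl: sum(matches(rl[0], x) for x, _ in remaining))
        theory.append((rule, label))
        remaining = [(x, c) for x, c in remaining if not matches(rule, x)]
    return theory

def classify_all_rules(rules, example):
    """Learning-all-rules flavour: every rule that fires casts one vote."""
    votes = {}
    for rule, label in rules:
        if matches(rule, example):
            votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get) if votes else None

if __name__ == "__main__":
    print("covering theory:", covering(DATA))   # a few rules suffice
    rules = all_consistent_rules(DATA)          # many redundant rules
    print("all-rules vote:",
          classify_all_rules(rules, {"outlook": "rain", "windy": "no"}))
```

On this toy data the covering learner stops after two rules (one test on windy per class), whereas exhaustive enumeration retains six pure rules; that redundancy is what gives the all-rules learner multiple votes per prediction, the effect the paper evaluates at scale.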

[1] David W. Aha et al. Special Issue on Lazy Learning, 1997.

[2] J. Ross Quinlan. C4.5: Programs for Machine Learning, 1992.

[3] Editorial: Lazy Learning.

[4] Geoffrey I. Webb. OPUS: An Efficient Admissible Algorithm for Unordered Search, 1995, J. Artif. Intell. Res.

[5] Ryszard S. Michalski. A theory and methodology of inductive learning, 1993.

[6] Robert E. Schapire. The strength of weak learnability, 1990, Mach. Learn.

[7] Thomas G. Dietterich et al. Solving Multiclass Learning Problems via Error-Correcting Output Codes, 1994, J. Artif. Intell. Res.

[8] J. L. Hodges et al. Discriminatory Analysis - Nonparametric Discrimination: Consistency Properties, 1989.

[9] Richard Nock et al. On Learning Decision Committees, 1995, ICML.

[10] Pat Langley et al. Improving Efficiency by Learning Intermediate Concepts, 1989, IJCAI.

[11] Geoffrey I. Webb. A heuristic covering algorithm has higher predictive accuracy than learning all rules, 1996.

[12] Geoffrey I. Webb. Further Experimental Evidence against the Utility of Occam's Razor, 1996, J. Artif. Intell. Res.

[13] L. N. Kanal et al. Uncertainty in Artificial Intelligence 5, 1990.

[14] Michael J. Pazzani et al. On learning multiple descriptions of a concept, 1994, Proceedings of the Sixth International Conference on Tools with Artificial Intelligence (TAI 94).

[15] J. Rissanen. Stochastic Complexity in Statistical Inquiry, 1989.

[16] Pedro M. Domingos. Rule Induction and Instance-Based Learning: A Unified Approach, 1995, IJCAI.

[17] Ron Kohavi et al. Lazy Decision Trees, 1996, AAAI/IAAI, Vol. 1.

[18] David W. Aha. A study of instance-based algorithms for supervised learning tasks: mathematical, empirical, and psychological evaluations, 1990.

[19] John R. Anderson et al. Machine Learning: An Artificial Intelligence Approach, 1983.

[20] Usama M. Fayyad et al. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, 1993, IJCAI.

[21] Peter Clark et al. Rule Induction with CN2: Some Recent Improvements, 1991, EWSL.

[22] Catherine Blake et al. UCI Repository of machine learning databases, 1998.

[23] Chris Carter et al. Multiple decision trees, 1988, UAI.

[24] Stephen Muggleton et al. Efficient Induction of Logic Programs, 1990, ALT.

[25] Leo Breiman. Bagging Predictors, 1996, Machine Learning.

[26] Aiko M. Hormann. Programs for Machine Learning, Part I, 1962, Inf. Control.