A Re-evaluation of the Over-Searching Phenomenon in Inductive Rule Learning

Most commonly used inductive rule learning algorithms employ a hill-climbing search, whereas local pattern discovery algorithms employ exhaustive search. In this paper, we evaluate a spectrum of search strategies in order to see whether separate-and-conquer rule learning algorithms can gain predictive accuracy or reduce theory size by using more powerful search strategies such as beam search or exhaustive search. Unlike previous work, which demonstrated that rule learning algorithms suffer from over-searching, we pay particular attention to the connection between the search heuristic and the search strategy, and we show that for some rule evaluation functions, more complex search algorithms consistently improve results without suffering from the over-searching phenomenon. In particular, we will see that this is typically the case for heuristics that perform poorly in a hill-climbing search. We interpret this as evidence that commonly used rule learning heuristics conflate two different aspects: a rule evaluation metric, which measures the predictive quality of a rule, and a search heuristic, which captures the potential of a candidate rule to be refined into a highly predictive rule. For effective exhaustive search, these two aspects need to be clearly separated.
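To make the contrast between search strategies concrete, the following is a minimal sketch (not taken from the paper) of top-down rule refinement over a toy boolean dataset. The dataset, attribute names, and the Laplace-corrected precision heuristic are illustrative assumptions; the only point is that setting the beam width to 1 reduces the search to hill-climbing, while a wider beam explores more refinements per step.

```python
# Toy binary-class dataset: (attribute dict, label).
# Attribute names and labels are illustrative, not from the paper.
DATA = [
    ({"a": 1, "b": 0, "c": 1}, 1),
    ({"a": 1, "b": 1, "c": 0}, 1),
    ({"a": 0, "b": 1, "c": 1}, 1),
    ({"a": 1, "b": 0, "c": 0}, 0),
    ({"a": 0, "b": 0, "c": 1}, 0),
    ({"a": 0, "b": 1, "c": 0}, 0),
]

def covers(rule, example):
    """A rule is a tuple of (attribute, value) conditions; all must hold."""
    return all(example[a] == v for a, v in rule)

def laplace(rule, data):
    """Laplace-corrected precision: (p + 1) / (p + n + 2)."""
    covered = [label for x, label in data if covers(rule, x)]
    p = sum(covered)
    n = len(covered) - p
    return (p + 1) / (p + n + 2)

def beam_search(data, attributes, heuristic, beam_width=3, max_conds=3):
    """Top-down refinement; beam_width=1 is plain hill-climbing."""
    beam = [()]          # start from the empty (most general) rule
    best = ()
    for _ in range(max_conds):
        # Refine every rule in the beam by one extra condition.
        candidates = {
            rule + ((a, v),)
            for rule in beam
            for a in attributes
            for v in (0, 1)
            if a not in dict(rule)
        }
        if not candidates:
            break
        ranked = sorted(candidates, key=lambda r: heuristic(r, data),
                        reverse=True)
        beam = ranked[:beam_width]
        best = max([best] + beam, key=lambda r: heuristic(r, data))
    return best

hill = beam_search(DATA, ["a", "b", "c"], laplace, beam_width=1)
wide = beam_search(DATA, ["a", "b", "c"], laplace, beam_width=4)
# A wider beam can never find a worse rule under the same heuristic
# (on the training data; the paper's question is about generalization).
assert laplace(wide, DATA) >= laplace(hill, DATA)
```

Note that the same function (`laplace`) is used here both to rank refinements during the search and to evaluate the final rule; the abstract's point is precisely that these two roles should be played by different functions when moving to more powerful search strategies.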
