Finding a short and accurate decision rule in disjunctive normal form by exhaustive search

Greedy approaches suffer from a restricted search space, which can lead to suboptimal classifiers in terms of both performance and classifier size. This study discusses exhaustive search as an alternative to greedy search for learning short and accurate decision rules. We present the Exhaustive Procedure for LOgic-Rule Extraction (EXPLORE), an algorithm that induces decision rules in disjunctive normal form (DNF) in a systematic and efficient manner. We propose a subsumption-based method that, by taking the relational operator into account, reduces the number of values considered for instantiation in the literals without loss of performance. Furthermore, we describe a branch-and-bound approach that makes optimal use of user-defined performance constraints. To improve generalizability, we use a validation set to determine the optimal length of the DNF rule. The performance and size of the DNF rules induced by EXPLORE are compared to those of eight well-known rule learners. Our results show that an exhaustive approach to rule learning in DNF yields significantly smaller classifiers than those of the other rule learners, while achieving comparable or even better performance. Clearly, exhaustive search is computationally intensive and may not always be feasible. Nevertheless, based on this study, we believe that exhaustive search should be considered an alternative to greedy search in many problems.
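To make the central idea of the abstract concrete, the sketch below shows one way an exhaustive DNF rule search with a branch-and-bound cut on a user-defined performance constraint could be organized. It is a minimal illustration, not the authors' EXPLORE implementation: the function names (`search`, `sens_spec`, `covers`), the parameters (`max_terms`, `max_literals`, `min_sens`), and the literal encoding are assumed for this example, and the paper's subsumption-based value reduction and validation-set selection of rule length are omitted.

```python
from itertools import combinations

def covers(term, x):
    # A conjunctive term is a tuple of (feature_index, op, threshold) literals;
    # an example x (a list of feature values) is covered when every literal holds.
    return all((x[f] <= t) if op == "<=" else (x[f] > t) for f, op, t in term)

def rule_covers(rule, x):
    # A DNF rule covers an example when any of its conjunctive terms does.
    return any(covers(term, x) for term in rule)

def sens_spec(rule, X, y):
    # Sensitivity and specificity of the rule "predict positive if covered".
    tp = sum(1 for x, lab in zip(X, y) if lab == 1 and rule_covers(rule, x))
    fp = sum(1 for x, lab in zip(X, y) if lab == 0 and rule_covers(rule, x))
    pos, neg = sum(y), len(y) - sum(y)
    return (tp / pos if pos else 0.0), ((neg - fp) / neg if neg else 1.0)

def search(X, y, literals, max_terms=2, max_literals=2, min_sens=0.8):
    """Exhaustively grow a DNF rule one conjunctive term at a time.
    Branch-and-bound cut: adding a disjunct never lowers sensitivity and
    never raises specificity, so a branch whose specificity already falls
    at or below that of the incumbent best rule can be discarded."""
    # All candidate conjunctive terms of up to max_literals literals.
    terms = [c for k in range(1, max_literals + 1)
             for c in combinations(literals, k)]
    best = {"rule": None, "spec": -1.0}

    def extend(rule, start):
        sens, spec = sens_spec(rule, X, y) if rule else (0.0, 1.0)
        if rule and spec <= best["spec"]:
            return  # prune: no extension of this branch can beat the incumbent
        if rule and sens >= min_sens and spec > best["spec"]:
            best["rule"], best["spec"] = list(rule), spec
        if len(rule) < max_terms:
            for i in range(start, len(terms)):
                extend(rule + [terms[i]], i + 1)

    extend([], 0)
    return best["rule"], best["spec"]

# Toy usage on a hypothetical dataset with two numeric features.
X = [[1.0, 3.0], [2.0, 1.0], [0.5, 4.0], [3.0, 0.5]]
y = [1, 1, 0, 0]
literals = [(0, "<=", 2.0), (1, ">", 1.0)]
print(search(X, y, literals))
```

The pruning step relies only on the monotonicity of disjunction (more terms can only increase coverage), which is why a user-defined constraint such as minimum sensitivity combined with a best-so-far specificity gives a valid bound for cutting branches during the exhaustive enumeration.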
