Confirmation-Guided Discovery of First-Order Rules with Tertius

This paper deals with learning first-order logic rules from data that lack an explicit classification predicate. Consequently, the learned rules are not restricted to predicate definitions, as they are in supervised inductive logic programming. First-order logic makes it possible to deal with structured, multi-relational knowledge. Possible applications include first-order knowledge discovery, induction of integrity constraints in databases, multiple predicate learning, and learning mixed theories of predicate definitions and integrity constraints. One of the contributions of our work is a heuristic measure of confirmation that trades off the novelty of a rule against its satisfaction. The approach has been implemented in the Tertius system. The system performs an optimal best-first search that finds the k most confirmed hypotheses, and it includes a non-redundant refinement operator to avoid duplicates in the search. Tertius can be adapted to many different domains by tuning its parameters, and it can deal either with individual-based representations, by upgrading propositional representations to first-order ones, or with general logical rules. We describe a number of experiments demonstrating the feasibility and flexibility of our approach.
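As an illustration of the search strategy described above, the following is a minimal sketch of a best-first search that returns the k highest-scoring hypotheses. It is not the Tertius implementation. The function names (`top_k_best_first`, `refine`, `score`, `bound`) and the toy scoring function are assumptions made for this example, and the toy score stands in for the confirmation measure, whose definition is not given here. The sketch assumes that optimality is obtained from an optimistic estimate: `bound(h)` must never be smaller than the score of `h` or of any refinement of `h`. The toy refinement operator only adds elements with a larger index than those already present, so each hypothesis is generated exactly once.

```python
import heapq
from itertools import count


def top_k_best_first(root, refine, score, bound, k):
    """Return the k best (score, hypothesis) pairs, best first.

    Assumes bound(h) >= score(h') for h and every refinement h' of h.
    """
    tie = count()  # tie-breaker so the heaps never compare hypotheses
    agenda = [(-bound(root), next(tie), root)]  # max-heap on the bound
    best = []  # min-heap holding at most k (score, tie, hypothesis)
    while agenda:
        neg_bound, _, h = heapq.heappop(agenda)
        # Nothing left on the agenda can beat the current k-th best.
        if len(best) == k and -neg_bound <= best[0][0]:
            break
        s = score(h)
        if len(best) < k:
            heapq.heappush(best, (s, next(tie), h))
        elif s > best[0][0]:
            heapq.heapreplace(best, (s, next(tie), h))
        for child in refine(h):
            b = bound(child)
            # Keep a refinement only if it may still enter the top k.
            if len(best) < k or b > best[0][0]:
                heapq.heappush(agenda, (-b, next(tie), child))
    return sorted(((s, h) for s, _, h in best), key=lambda p: -p[0])


# Toy search space (hypothetical): a hypothesis is a tuple of increasing
# indices into WEIGHTS, and its score is the sum of the selected weights.
WEIGHTS = [3, -1, 2, -4, 1]


def refine(h):
    """Non-redundant refinement: only append indices larger than the last."""
    start = h[-1] + 1 if h else 0
    for i in range(start, len(WEIGHTS)):
        yield h + (i,)


def score(h):
    return sum(WEIGHTS[i] for i in h)


def bound(h):
    """Optimistic estimate: add every positive weight still reachable."""
    start = h[-1] + 1 if h else 0
    return score(h) + sum(w for w in WEIGHTS[start:] if w > 0)


result = top_k_best_first((), refine, score, bound, k=3)
```

The search stops as soon as the best optimistic estimate on the agenda cannot exceed the k-th best score found so far, which is what makes the returned scores optimal under the stated assumption on `bound`.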
