Condensed Representations for Inductive Logic Programming

When mining frequent Datalog queries, many queries cover the same examples; that is, they are equivalent and hence redundant. Such equivalences can arise from the data set or from regularities specified in the background theory. To avoid generating redundant clauses, we introduce several types of condensed representations. More specifically, we introduce δ-free and closed clauses, which are defined w.r.t. the data set, and semantically free and closed clauses, which take into account a logical background theory. A novel algorithm that employs these representations is also presented and experimentally evaluated on a number of benchmark problems in inductive logic programming.
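The notions of equivalent, closed, and δ-free patterns can be illustrated with a minimal propositional sketch. This is not the algorithm presented in the paper: it assumes each example has been reduced to the set of literals that hold for it, so a pattern is a plain set of literals rather than a Datalog query, and no background theory is used. The toy data, the literal names, and all function names are hypothetical.

```python
from itertools import combinations

# Toy data (hypothetical): each example is the set of literals true for it.
EXAMPLES = [
    {"atom_c", "bond_double", "ring"},
    {"atom_c", "bond_double", "ring"},
    {"atom_c", "bond_double"},
    {"atom_c", "ring"},
    {"atom_n"},
]
ITEMS = sorted(set().union(*EXAMPLES))


def cover(pattern):
    """Indices of the examples that satisfy every literal in the pattern."""
    return frozenset(i for i, ex in enumerate(EXAMPLES) if pattern <= ex)


def support(pattern):
    """Number of examples covered by the pattern."""
    return len(cover(pattern))


def is_delta_free(pattern, delta=0):
    """True if no rule (pattern - {a}) => a holds with at most delta exceptions.

    Checking the largest rule body for each literal a is sufficient, because
    a smaller body can only have more exceptions.
    """
    return all(support(pattern - {a}) - support(pattern) > delta for a in pattern)


def is_closed(pattern):
    """True if adding any further literal loses at least one covered example."""
    return all(
        support(pattern | {a}) < support(pattern)
        for a in ITEMS
        if a not in pattern
    )


def frequent_patterns(min_support=1):
    """Naively enumerate all non-empty patterns with enough support."""
    for size in range(1, len(ITEMS) + 1):
        for combo in combinations(ITEMS, size):
            pattern = frozenset(combo)
            if support(pattern) >= min_support:
                yield pattern


if __name__ == "__main__":
    patterns = list(frequent_patterns())
    closed = [p for p in patterns if is_closed(p)]
    free = [p for p in patterns if is_delta_free(p, delta=0)]
    print(len(patterns), "frequent patterns")
    print(len({cover(p) for p in patterns}), "distinct covers")
    print(len(closed), "closed patterns")
    print(len(free), "0-free patterns")
```

On this toy data, the eight frequent patterns share only five distinct covers, and each group of equivalent patterns contains exactly one closed pattern, so keeping the closed patterns alone loses no coverage information. Raising `delta` tolerates rules with a few exceptions and shrinks the set of free patterns further.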
