Scaling Up Inductive Logic Programming by Learning from Interpretations

When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency: ILP techniques are typically more expressive but also less efficient. As a consequence, the data sets handled by current ILP systems are small by the general standards of the data mining community. The main source of inefficiency lies in the assumption that several examples may be related to each other, so that they cannot be handled independently.

Within the learning from interpretations framework for ILP this assumption is unnecessary, which makes it possible to scale up existing ILP algorithms. In this paper we explain this learning setting in the context of relational databases. We relate the setting to propositional data mining and to the classical ILP setting, and show that learning from interpretations corresponds to learning from multiple relations: it extends the expressiveness of propositional learning while largely maintaining its efficiency, which is not the case in the classical ILP setting.

As a case study, we present two alternative implementations of the ILP system TILDE (Top-down Induction of Logical DEcision trees): TILDE_classic, which loads all data into main memory, and TILDE_LDS, which loads the examples one by one. We compare the implementations experimentally and show that TILDE_LDS can handle large data sets (on the order of 100,000 examples or 100 MB) and indeed scales linearly in the number of examples.
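The central point of the abstract, that in learning from interpretations each example is a self-contained set of facts that can be processed independently, can be illustrated with a minimal sketch. The following Python fragment is not part of the paper or of TILDE; the `Interpretation` class, the `count_covered` function, and the example facts are hypothetical names introduced only to show how a learner can stream examples one at a time (in the spirit of TILDE_LDS) so that memory use does not grow with the number of examples.

```python
# Minimal sketch (not the authors' implementation) of learning from
# interpretations: every example is an interpretation, i.e. a small set of
# ground facts plus a class label, so examples can be streamed one by one
# instead of being loaded into main memory together.

from dataclasses import dataclass


@dataclass
class Interpretation:
    """One example: a set of ground facts and its class label."""
    facts: frozenset   # e.g. {("atom", "d1_1", "c"), ("bond", "d1_1", "d1_2", 7)}
    label: str         # e.g. "pos" or "neg"


def count_covered(examples, test):
    """Count how many examples satisfy a candidate test.

    `examples` can be any iterable of Interpretation objects, including a
    generator that reads one example at a time from disk, so memory use
    stays constant in the number of examples.
    """
    covered = 0
    for ex in examples:            # each example is handled independently
        if test(ex.facts):
            covered += 1
    return covered


if __name__ == "__main__":
    # Hypothetical usage: a test asking whether an example contains a carbon atom.
    stream = (
        Interpretation(frozenset({("atom", "d1_1", "c")}), "pos"),
        Interpretation(frozenset({("atom", "d2_1", "o")}), "neg"),
    )
    has_carbon = lambda facts: any(f[0] == "atom" and f[2] == "c" for f in facts)
    print(count_covered(stream, has_carbon))   # -> 1
```

Because the test only ever inspects the facts of a single interpretation, the same coverage statistics could be computed from a file scan or a database cursor, which is what makes the setting amenable to large, disk-resident data sets.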
