On the connection between the phase transition of the covering test and the learning success rate in ILP

Abstract It is well known that heuristic search in ILP is prone to plateau phenomena. Following the work of Giordana and Saitta, an explanation is that the ILP covering test is NP-complete and, like many NP-complete problems, exhibits a sharp phase transition in its coverage probability. Since the heuristic value of a hypothesis depends on the number of examples it covers, the “yes” and “no” regions surrounding the phase transition form plateaus that must be crossed during search without informative heuristic guidance. Several subsequent works studied this finding extensively by running learning algorithms on a large set of artificially generated problems, and argued that the occurrence of this phase transition dooms every learning algorithm to fail to identify the target concept. We note, however, that only generate-and-test learning algorithms were evaluated, and that this conclusion must be qualified in the case of data-driven learning algorithms. Building mainly on the pioneering work of Winston on near-miss examples, we show that, on the same set of problems, a top-down data-driven strategy can cross any plateau if near-misses are supplied in the training set, whereas near-misses neither change the plateau profile nor guide a generate-and-test strategy. We conclude that the location of the target concept with respect to the phase transition is not, by itself, a reliable indicator of the difficulty of a learning problem, as was previously thought.
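
To make the contrast concrete, here is a minimal sketch in Python under the following assumptions: examples are sets of ground facts, hypotheses are lists of literals over variables, and the covering test is theta-subsumption checked by brute force. covers() plays the role of the NP-complete test whose success probability undergoes the phase transition, coverage_score() is the count-based generate-and-test heuristic that is flat on a plateau, and specialize() is a Winston-style near-miss step. All predicate names, facts, and helper functions are invented for illustration; this is not the paper's implementation.

from itertools import product

def is_var(term):
    # Convention for this sketch: variables start with an upper-case letter.
    return term[:1].isupper()

def covers(hypothesis, example):
    # Brute-force covering test (theta-subsumption): does some substitution
    # of the example's constants for the hypothesis' variables map every
    # hypothesis literal onto a fact of the example? This test is
    # NP-complete, hence the phase transition in its success probability.
    variables = sorted({t for _, args in hypothesis for t in args if is_var(t)})
    constants = sorted({t for _, args in example for t in args})
    for image in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, image))
        if all((pred, tuple(theta.get(t, t) for t in args)) in example
               for pred, args in hypothesis):
            return True
    return False

def coverage_score(hypothesis, positives, negatives):
    # The generate-and-test heuristic: count covered examples. Deep inside
    # the "yes" region every candidate covers everything, deep inside the
    # "no" region nothing is covered, so the score is constant: a plateau.
    return (sum(covers(hypothesis, e) for e in positives)
            - sum(covers(hypothesis, e) for e in negatives))

def specialize(hypothesis, candidate_literals, positive, near_miss):
    # Winston-style data-driven step: keep the positive covered while
    # rejecting the near-miss. The (positive, near_miss) pair singles out
    # a discriminating literal even where coverage_score() is flat.
    for lit in candidate_literals:
        refined = hypothesis + [lit]
        if covers(refined, positive) and not covers(refined, near_miss):
            return refined
    return None

# Toy run: the near-miss differs from the positive by a single fact.
positive = {("bond", ("a1", "a2")), ("carbon", ("a1",)), ("carbon", ("a2",))}
near_miss = {("bond", ("a1", "a2")), ("carbon", ("a1",)), ("oxygen", ("a2",))}
h0 = [("bond", ("X", "Y"))]  # overly general: covers both examples
candidates = [("carbon", ("X",)), ("carbon", ("Y",)), ("oxygen", ("Y",))]
print(specialize(h0, candidates, positive, near_miss))
# -> [('bond', ('X', 'Y')), ('carbon', ('Y',))]

The design point is that specialize() compares a single positive with a single near-miss instead of aggregating counts over the training set, which is why it can still pick a refinement where a count-based score is constant across candidates.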

[1] Ryszard S. Michalski et al. A theory and methodology of inductive learning. 1993.

[2] Luc De Raedt et al. Lookahead and Discretization in ILP. ILP, 1997.

[3] Carlos A. Coello Coello et al. MICAI 2002: Advances in Artificial Intelligence. Lecture Notes in Computer Science, 2002.

[4] Benjamin D. Smith et al. Incremental Non-Backtracking Focusing: A Polynomially Bounded Generalization Algorithm for Version Spaces. AAAI, 1990.

[5] Stefan Wrobel et al. On the Stability of Example-Driven Learning Systems: A Case Study in Multirelational Learning. MICAI, 2002.

[6] Érick Alphonse et al. Macro-Operators Revisited in Inductive Logic Programming. ILP, 2004.

[7] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. 1995.

[8] J. Ross Quinlan. C4.5: Programs for Machine Learning. 1992.

[9] Tom M. Mitchell. Generalization as Search. 2002.

[10] Patrick Henry Winston. Learning structural descriptions from examples. 1970.

[11] Hendrik Blockeel et al. Top-Down Induction of First Order Logical Decision Trees. AI Communications, 1998.

[12] Alberto Maria Segre. Programs for Machine Learning. 1994.

[13] Lorenza Saitta et al. Learning on the Phase Transition Edge. IJCAI, 2001.

[14] J. Ross Quinlan. Induction of Decision Trees. Machine Learning, 1986.

[15] Michèle Sebag et al. Relational Learning as Search in a Critical Region. Journal of Machine Learning Research, 2003.

[16] Patrick Henry Winston. The psychology of computer vision. Pattern Recognition, 1976.

[17] Ehud Shapiro. Algorithmic Program Debugging. 1983.

[18] Gordon Plotkin. A Note on Inductive Generalization. Machine Intelligence 5, 1970.

[19] Michael J. Pazzani et al. Relational Clichés: Constraining Induction During Relational Learning. ML, 1991.

[20] David Haussler. Learning Conjunctive Concepts in Structural Domains. Machine Learning, 1989.

[21] Peter C. Cheeseman et al. Where the Really Hard Problems Are. IJCAI, 1991.

[22] Alex S. Taylor. Machine intelligence. CHI, 2009.

[23] Céline Rouveirol et al. Extension of the Top-Down Data-Driven Strategy to ILP. ILP, 2007.

[24] Richard E. Korf. Depth-First Iterative-Deepening: An Optimal Admissible Tree Search. Artificial Intelligence, 1985.

[25] Krzysztof R. Apt. Logic Programming. Handbook of Theoretical Computer Science, Volume B: Formal Models and Semantics, 1990.

[26] Umesh V. Vazirani et al. An Introduction to Computational Learning Theory. 1994.

[27] Lorenza Saitta et al. Phase Transitions in Relational Learning. Machine Learning, 2000.

[28] Philip D. Laird. Inductive Inference by Refinement. AAAI, 1986.

[29] J. Ross Quinlan. Learning Logical Definitions from Relations. Machine Learning, 1990.

[30] Johannes Fürnkranz et al. ROC 'n' Rule Learning: Towards a Better Understanding of Covering Algorithms. Machine Learning, 2005.

[31] Raymond J. Mooney et al. Learning Relations by Pathfinding. AAAI, 1992.

[32] Allen Newell et al. Human Problem Solving. 1973.

[33] Georg Gottlob et al. On the complexity of some inductive logic programming problems. New Generation Computing, 1997.

[34] Stephen Muggleton. Inverse entailment and Progol. New Generation Computing, 1995.

[35] Michèle Sebag et al. Constraint-based Learning of Long Relational Concepts. ICML, 2002.

[36] Michèle Sebag et al. Analyzing Relational Learning in the Phase Transition Framework. ICML, 2000.

[37] Peter Clark et al. The CN2 induction algorithm. Machine Learning, 1989.

[38] Richard E. Korf. Macro-Operators: A Weak Method for Learning. Artificial Intelligence, 1985.

[39] Frederick Hayes-Roth et al. Knowledge acquisition from structural descriptions. IJCAI, 1977.

[40] J. Ross Quinlan. Determinate Literals in Inductive Logic Programming. IJCAI, 1991.

[41] William W. Cohen. Learnability of Restricted Logic Programs. ILP, 1993.

[42] William W. Cohen. Fast Effective Rule Induction. ICML, 1995.