Learning Rules from Multiple Instance Data: Issues and Algorithms

In a multiple-instance representation, each learning example is represented by a “bag” of fixed-length “feature vectors”. Such a representation lies between propositional and first-order representations and offers a tradeoff between the two. This paper proposes a generic extension to propositional rule learners to handle multiple-instance data. It describes NAIVE-RIPPERMI, an implementation of this extension on the rule learning algorithm RIPPER, and then explains several pitfalls encountered by this naive extension during induction. It goes on to describe algorithmic modifications and a new multiple-instance coverage measure, which are shown to avoid these pitfalls. Experimental results show the benefits of this approach, in terms of speed and accuracy, for solving propositionalized relational problems.

Keywords: multiple-instance learning problem, rule learning, propositionalization, relational learning, mutagenesis learning task
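To make the bag representation concrete, the following is a minimal sketch of how a propositional rule might be evaluated on multiple-instance data. It assumes the standard existential convention, under which a rule covers a bag if it covers at least one instance in that bag. It does not reproduce the new coverage measure proposed in the paper, and the names `covers_bag` and `count_covered`, the example rule, and the data are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Instance = Sequence[float]          # one fixed-length feature vector
Bag = List[Instance]                # one learning example: a bag of instances
Rule = Callable[[Instance], bool]   # a propositional rule tested on one instance


def covers_bag(rule: Rule, bag: Bag) -> bool:
    """Assumed existential semantics: the bag is covered if any instance is."""
    return any(rule(instance) for instance in bag)


def count_covered(rule: Rule, bags: List[Bag], labels: List[bool]) -> Tuple[int, int]:
    """Count covered positive and negative bags (bag-level, not instance-level)."""
    pos = sum(1 for bag, label in zip(bags, labels) if label and covers_bag(rule, bag))
    neg = sum(1 for bag, label in zip(bags, labels) if not label and covers_bag(rule, bag))
    return pos, neg


# Hypothetical rule: a conjunction of two attribute tests.
def rule(x: Instance) -> bool:
    return x[0] > 0.5 and x[1] <= 2.0


# Hypothetical data: three bags with two features per instance.
bags = [
    [(0.1, 1.0), (0.9, 1.5)],   # second instance satisfies the rule
    [(0.2, 3.0), (0.3, 0.5)],   # no instance satisfies the rule
    [(0.7, 2.5)],               # first test passes, second fails
]
labels = [True, False, False]

pos, neg = count_covered(rule, bags, labels)
```

Counting coverage per bag rather than per instance is the main change a propositional rule learner needs in this setting, since class labels are attached to bags and not to individual feature vectors.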
