Propositionalization approaches to relational data mining

This chapter surveys methods that transform a relational representation of a learning problem into a propositional (feature-based, attribute-value) representation. This change of representation is known as propositionalization. It allows feature construction to be decoupled from model construction, and in many relational data mining applications this decoupling has been shown to cause no loss of predictive performance. After reviewing both general-purpose and domain-dependent propositionalization approaches from the literature, the chapter describes an extension to the LINUS propositionalization method that overcomes the system's earlier inability to deal with non-determinate local variables.
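
As a rough illustration of the representation change described above, the following minimal sketch turns a toy one-to-many relation into a single attribute-value table. The trains-and-cars data, the feature names, and the `propositionalize` helper are all hypothetical and chosen only for illustration. This is not the LINUS algorithm or its extension; it only shows the general idea of building Boolean features by existentially quantifying over a local variable (here, the car) that can take several values per example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy relational data: each train owns several cars (one-to-many).
# A car is (shape, number_of_wheels, load).
Car = Tuple[str, int, str]
trains: Dict[str, List[Car]] = {
    "t1": [("rectangle", 2, "circle"), ("rectangle", 3, "triangle")],
    "t2": [("ellipse", 2, "none")],
    "t3": [("rectangle", 2, "none"), ("ellipse", 2, "circle")],
}

# Hypothetical features: each one is an existential test over the cars of a
# train, i.e. "the train has SOME car such that ...".
features: Dict[str, Callable[[Car], bool]] = {
    "has_rectangle_car": lambda car: car[0] == "rectangle",
    "has_three_wheel_car": lambda car: car[1] == 3,
    "has_loaded_car": lambda car: car[2] != "none",
}


def propositionalize(
    data: Dict[str, List[Car]],
    feats: Dict[str, Callable[[Car], bool]],
) -> Dict[str, Dict[str, bool]]:
    """Build one row of Boolean attributes per train."""
    return {
        train_id: {name: any(test(car) for car in cars) for name, test in feats.items()}
        for train_id, cars in data.items()
    }


table = propositionalize(trains, features)
for train_id, row in table.items():
    print(train_id, row)
```

The resulting table has one row per train and one Boolean column per feature, so any propositional learner could in principle be applied to it. This is the sense in which feature construction is decoupled from model construction.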
