Approximate Relational Reasoning by Stochastic Propositionalization

For many real-world applications, choosing the right representation language is crucial. First-Order Logic (FOL) is the most suitable setting for modelling the multi-relational data of real, complex domains, but it raises the problem of the computational complexity of the knowledge induction process. One way to tackle the complexity of such domains, in which many relationships are needed to model the objects involved, is to reformulate the multi-relational learning task as an attribute-value one. In this chapter we present an approximate reasoning method that keeps the complexity of a relational problem low by using a stochastic inference procedure. The complexity of the relational language is reduced by means of a propositionalization technique, while the NP-completeness of deduction is tackled using approximate query evaluation. The proposed technique has been applied both to relational rule induction and to relational clustering. Induction is performed by an anytime algorithm, implemented as a population-based method, that efficiently extracts knowledge from relational data; clustering, both unsupervised and supervised, is performed with the Partitioning Around Medoids (PAM) algorithm. The validity of the proposed techniques has been assessed through an empirical evaluation on real-world datasets.
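As an illustration of the general idea only, the following minimal sketch shows how relational examples might be turned into a Boolean attribute-value table using an approximate, sampling-based coverage test. It is not the chapter's actual algorithm: the fact encoding (tuples of predicate name and arguments), the convention that capitalized strings are variables, and the names `try_bind`, `stochastic_covers`, `propositionalize`, and `trials` are all assumptions made for this example.

```python
import random

# Assumed encoding: a ground fact is a tuple (predicate, arg1, arg2, ...).
# A pattern literal has the same shape; capitalized strings are variables.

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def try_bind(literal, fact, theta):
    """Extend substitution theta so that literal matches fact, else None."""
    if literal[0] != fact[0] or len(literal) != len(fact):
        return None
    theta = dict(theta)  # work on a copy so the caller's bindings survive
    for term, value in zip(literal[1:], fact[1:]):
        if is_var(term):
            # Bind the variable, or check it against its existing binding.
            if theta.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return theta

def stochastic_covers(pattern, facts, trials, rng):
    """Approximate coverage test based on `trials` random matchings.

    Each trial binds the literals left to right, picking one consistent
    fact at random and never backtracking. A True answer is always backed
    by a real matching; a False answer may be a missed matching.
    """
    for _ in range(trials):
        theta = {}
        for literal in pattern:
            options = [t for t in (try_bind(literal, f, theta) for f in facts)
                       if t is not None]
            if not options:
                theta = None
                break
            theta = rng.choice(options)
        if theta is not None:
            return True
    return False

def propositionalize(examples, patterns, trials=20, seed=0):
    """Build one Boolean row per example and one column per pattern."""
    rng = random.Random(seed)
    return [[int(stochastic_covers(p, ex, trials, rng)) for p in patterns]
            for ex in examples]

# Toy data, invented for this sketch.
examples = [
    [("on", "a", "b"), ("on", "b", "c"), ("red", "c")],
    [("on", "a", "b"), ("red", "a")],
]
patterns = [
    [("on", "X", "Y"), ("on", "Y", "Z")],
    [("on", "X", "Y"), ("red", "Y")],
    [("red", "X")],
]
table = propositionalize(examples, patterns)
```

Each row of `table` is an attribute-value description of one relational example, which a propositional learner or a distance-based clustering method could then consume. Raising `trials` trades running time for a lower chance of missing a matching, which is the kind of time/accuracy trade-off the abstract associates with approximate query evaluation.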
