rsLDA: A Bayesian hierarchical model for relational learning

We introduce and evaluate a technique for relational learning tasks that combines a framework for mining relational queries with a hierarchical Bayesian model. The novel rsLDA algorithm works in two steps. First, it discovers a set of relevant features in the relational data that describe the examples in propositional form. This corresponds to reformulating the problem from a relational representation space into an attribute-value one. Then, given this new feature space, a supervised version of the Latent Dirichlet Allocation model is applied to learn the probabilistic model. On two real-world datasets, the proposed method shows an improvement over the other methods it is compared with.
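The first step, the reformulation into attribute-value form, can be illustrated with a minimal sketch. It is not the rsLDA implementation: the fact encoding, the molecule-like toy data, and the helpers `has_atom`, `has_bond_between`, and `propositionalize` are all assumptions made for illustration, and the queries are written by hand, whereas in the method they are discovered by mining the relational data. Each example is a set of ground facts, each feature is a boolean query over those facts, and the result is a binary matrix with one row per example.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical encoding: a ground fact is a tuple such as ("atom", "a1", "c").
Fact = Tuple[str, ...]
Example = Set[Fact]
Query = Callable[[Example], bool]


def has_atom(element: str) -> Query:
    """Query that holds if the example contains an atom of the given element."""
    def query(example: Example) -> bool:
        return any(f[0] == "atom" and f[2] == element for f in example)
    return query


def has_bond_between(e1: str, e2: str) -> Query:
    """Query that holds if some bond links an atom of e1 with an atom of e2."""
    def query(example: Example) -> bool:
        # Map each atom identifier to its element.
        elements = {f[1]: f[2] for f in example if f[0] == "atom"}
        return any(
            f[0] == "bond"
            and {elements.get(f[1]), elements.get(f[2])} == {e1, e2}
            for f in example
        )
    return query


def propositionalize(examples: List[Example], queries: List[Query]) -> List[List[int]]:
    """Turn relational examples into rows of 0/1 attribute values."""
    return [[int(q(ex)) for q in queries] for ex in examples]


# Toy data (invented for this sketch): two small molecule-like examples.
MOL1: Example = {("atom", "a1", "c"), ("atom", "a2", "o"), ("bond", "a1", "a2")}
MOL2: Example = {("atom", "b1", "c"), ("atom", "b2", "c"), ("bond", "b1", "b2")}

# Hand-written stand-ins for the mined relational queries.
QUERIES = [has_atom("c"), has_atom("o"), has_bond_between("c", "o")]

MATRIX = propositionalize([MOL1, MOL2], QUERIES)
print(MATRIX)
```

Under these assumptions, each row of `MATRIX` plays the role of a propositional description of one example, which is the kind of input the supervised topic model of the second step would be trained on, together with the class labels.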
