Towards Multistrategic Statistical Relational Learning

Statistical Relational Learning (SRL) is a growing field of Machine Learning that aims to integrate logic-based learning approaches with probabilistic graphical models. Markov Logic Networks (MLNs) are a state-of-the-art SRL model that combines first-order logic and Markov networks (MNs) by attaching weights to first-order formulas and treating these as templates for features of MNs. Learning an SRL model consists of learning its structure (the logical clauses of an MLN) and its parameters (the weight of each clause). Structure learning of MLNs is performed by maximizing a likelihood function (or a function thereof) over relational databases, and MLNs have been applied successfully to problems in relational and uncertain domains. However, most complex domains are characterized by incomplete data. Until now, SRL models have mostly relied on Expectation-Maximization (EM) to learn statistical parameters in the presence of missing values. Multistrategic learning in the relational setting has been a successful approach to complex problems in which multiple inference mechanisms help solve different subproblems. Abduction is an inference strategy that has proven useful for completing missing values in observations. In this paper we propose two frameworks for integrating abduction into SRL models. The first tightly integrates logical abduction with structure and parameter learning of MLNs in a single step: during structure search guided by conditional likelihood, each candidate clause is evaluated by first trying to logically abduce the missing values in the data and then learning optimal pseudo-likelihood parameters on the completed data. The second integrates abduction with the Structural EM of [17] by performing logical abductive inference in the E-step and then optimizing the parameters in the M-step.
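The clause-evaluation step of the first framework can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-person domain, the single clause Smokes(x) => Cancer(x), the helper names (clause_count, abduce, pseudo_log_likelihood) and the simplistic abduction rule, which assumes the clause body whenever its head is observed and the body is missing. The sketch only scores a fixed weight on the completed data; it performs neither structure search nor weight optimization.

```python
import math

# Hypothetical toy domain (illustrative only, not from the paper).
PEOPLE = ["anna", "bob"]


def clause_count(world):
    """Count the true groundings of Smokes(x) => Cancer(x) in a complete world."""
    return sum(
        bool((not world[("smokes", p)]) or world[("cancer", p)]) for p in PEOPLE
    )


def abduce(world):
    """Toy abduction: if cancer(p) is observed true and smokes(p) is missing
    (None), assume smokes(p) as the explanation. Returns a new dict."""
    completed = dict(world)
    for p in PEOPLE:
        if completed[("smokes", p)] is None and completed[("cancer", p)]:
            completed[("smokes", p)] = True
    return completed


def pseudo_log_likelihood(world, w):
    """Sum over ground atoms of log P(atom = observed value | all other atoms),
    for an MLN with the single clause above and weight w."""
    total = 0.0
    for atom, value in world.items():
        scores = {}
        for v in (True, False):
            flipped = dict(world)
            flipped[atom] = v
            scores[v] = w * clause_count(flipped)
        log_norm = math.log(sum(math.exp(s) for s in scores.values()))
        total += scores[value] - log_norm
    return total


# Observations with one missing value: smokes(bob) is unknown.
observed = {
    ("smokes", "anna"): True,
    ("cancer", "anna"): True,
    ("smokes", "bob"): None,
    ("cancer", "bob"): True,
}

completed = abduce(observed)
for weight in (0.0, 0.5, 1.0):
    print(weight, pseudo_log_likelihood(completed, weight))
```

In a full learner, the weight would be chosen by an optimizer rather than read off a fixed grid, and the abductive step would be driven by a proper abductive proof procedure over declared abducible predicates.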

[1]  David Poole,et al.  Probabilistic Horn Abduction and Bayesian Networks , 1993, Artif. Intell..

[2]  Nils J. Nilsson,et al.  Probabilistic Logic , 2022 .

[3]  Andrew McCallum,et al.  Efficiently Inducing Features of Conditional Random Fields , 2002, UAI.

[4]  John D. Lafferty,et al.  Inducing Features of Random Fields , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  James Cussens,et al.  Parameter Estimation in Stochastic Logic Programs , 2001, Machine Learning.

[6]  Nir Friedman,et al.  Learning Belief Networks in the Presence of Missing Values and Hidden Variables , 1997, ICML.

[7]  Alon Y. Halevy,et al.  P-CLASSIC: A Tractable Probabilistic Description Logic , 1997, AAAI/IAAI.

[8]  Keith L. Clark,et al.  Negation as Failure , 1987, Logic and Data Bases.

[9]  Gordon Plotkin,et al.  A Note on Inductive Generalization , 2008 .

[10]  Pedro M. Domingos,et al.  Efficient Weight Learning for Markov Logic Networks , 2007, PKDD.

[11]  Robert A. Kowalski,et al.  Abduction Compared with Negation by Failure , 1989, ICLP.

[12]  Matthew Richardson,et al.  Markov logic networks , 2006, Machine Learning.

[13]  Pedro M. Domingos,et al.  A General Method for Reducing the Complexity of Relational Inference and its Application to MCMC , 2008, AAAI.

[14]  Fernando Pereira,et al.  Shallow Parsing with Conditional Random Fields , 2003, NAACL.

[15]  Stephen Muggleton,et al.  Inverse entailment and progol , 1995, New Generation Computing.

[16]  Evelina Lamma,et al.  Cooperation of Abduction and Induction in Logic Programming , 2000 .

[17]  F. Glover,et al.  Handbook of Metaheuristics , 2019, International Series in Operations Research & Management Science.

[18]  David Poole,et al.  A Logical Framework for Default Reasoning , 1988, Artif. Intell..

[20]  Raymond J. Mooney,et al.  Bottom-up learning of Markov logic network structure , 2007, ICML '07.

[21]  Joost N. Kok,et al.  Knowledge Discovery in Databases: PKDD 2007, 11th European Conference on Principles and Practice of Knowledge Discovery in Databases, Warsaw, Poland, September 17-21, 2007, Proceedings , 2007, PKDD.

[22]  Stefano Ferilli,et al.  Discriminative Structure Learning of Markov Logic Networks , 2008, ILP.

[23]  Stefano Ferilli,et al.  Structure Learning of Markov Logic Networks through Iterated Local Search , 2008, ECAI.

[25]  Ryszard S. Michalski,et al.  Inferential Theory of Learning: Developing Foundations for Multistrategy Learning , 1992 .

[26]  Luc De Raedt,et al.  Advances in Inductive Logic Programming , 1996 .

[27]  Jorge Nocedal,et al.  On the limited memory BFGS method for large scale optimization , 1989, Math. Program..

[28]  Chad Cumby, Dan Roth.  Feature Extraction Languages for Propositionalized Relational Learning , 2003 .

[29]  C. Geyer,et al.  Constrained Monte Carlo Maximum Likelihood for Dependent Data , 1992 .

[30]  S. Muggleton Stochastic Logic Programs , 1996 .

[31]  Peter Haddawy,et al.  Answering Queries from Context-Sensitive Probabilistic Knowledge Bases , 1997, Theor. Comput. Sci..

[32]  Saso Dzeroski,et al.  Inductive Logic Programming: Techniques and Applications , 1993 .

[33]  Stuart J. Russell,et al.  Approximate inference for first-order probabilistic languages , 2001, IJCAI.

[34]  P. Schönemann On artificial intelligence , 1985, Behavioral and Brain Sciences.

[35]  Raymond Reiter,et al.  A Logic for Default Reasoning , 1987, Artif. Intell..

[36]  Michael R. Genesereth,et al.  Logical foundations of artificial intelligence , 1987 .

[37]  Johannes Fürnkranz,et al.  Separate-and-Conquer Rule Learning , 1999, Artificial Intelligence Review.

[38]  Taisuke Sato,et al.  A Viterbi-like algorithm and EM learning for statistical abduction , 2000 .

[39]  Ben Taskar,et al.  Introduction to statistical relational learning , 2007 .

[40]  Pedro M. Domingos,et al.  Markov Logic in Infinite Domains , 2007, UAI.

[41]  Ben Taskar,et al.  Discriminative Probabilistic Models for Relational Data , 2002, UAI.

[42]  James Cussens,et al.  CLP(BN): Constraint Logic Programming for Probabilistic Knowledge , 2002, Probabilistic Inductive Logic Programming.

[43]  Pedro M. Domingos,et al.  Sound and Efficient Inference with Probabilistic and Deterministic Dependencies , 2006, AAAI.

[44]  Ehud Shapiro,et al.  Algorithmic Program Debugging , 1983 .

[45]  Pedro M. Domingos,et al.  Discriminative Training of Markov Logic Networks , 2005, AAAI.

[46]  Michael Wooldridge,et al.  Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, IJCAI 97, Nagoya, Japan, August 23-29, 1997, 2 Volumes , 1997, IJCAI.

[47]  Shan-Hwei Nienhuys-Cheng,et al.  Foundations of Inductive Logic Programming , 1997, Lecture Notes in Computer Science.

[48]  K. J. Evans Representing and Reasoning with Probabilistic Knowledge , 1993 .

[49]  Robert P. Goldman,et al.  From knowledge bases to decision models , 1992, The Knowledge Engineering Review.

[50]  A. Aliseda Abductive and Inductive Reasoning: Essays on their Relation and Integration , 2000 .

[51]  Luc De Raedt,et al.  Probabilistic Inductive Logic Programming - Theory and Applications , 2008, Probabilistic Inductive Logic Programming.

[52]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[53]  Paolo Mancarella,et al.  Abductive Logic Programming , 1992, LPNMR.

[54]  J. R. Quinlan Learning Logical Definitions from Relations , 1990 .

[55]  Andreas Arvanitis,et al.  Abduction with Stochastic Logic Programs based on a Possible Worlds Semantics , 2006 .

[56]  Luc De Raedt,et al.  Logical Settings for Concept-Learning , 1997, Artif. Intell..

[57]  Johann Eder,et al.  Logic and Databases , 1992, Advanced Topics in Artificial Intelligence.

[58]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[59]  Luc De Raedt,et al.  Clausal Discovery , 1997, Machine Learning.

[60]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[61]  Pedro M. Domingos,et al.  Learning the structure of Markov logic networks , 2005, ICML.

[62]  J. Besag Statistical Analysis of Non-Lattice Data , 1975 .

[63]  Luc De Raedt,et al.  Towards Combining Inductive Logic Programming with Bayesian Networks , 2001, ILP.

[64]  Helena Ramalhinho Dias Lourenço,et al.  Iterated Local Search , 2001, Handbook of Metaheuristics.

[65]  Luc De Raedt,et al.  Integrating Naïve Bayes and FOIL , 2007, J. Mach. Learn. Res..

[66]  Fabrizio Riguzzi,et al.  Learning with Abduction , 1997, ILP.

[67]  Raymond J. Mooney,et al.  Discriminative structure and parameter learning for Markov logic networks , 2008, ICML '08.

[68]  Lise Getoor,et al.  Learning Probabilistic Relational Models , 1999, IJCAI.

[69]  Nicola Fanizzi,et al.  Multistrategy Theory Revision: Induction and Abduction in INTHELEX , 2004, Machine Learning.

[70]  Joseph Y. Halpern An Analysis of First-Order Logics of Probability , 1989, IJCAI.

[71]  Lyle H. Ungar,et al.  Structural Logistic Regression for Link Analysis , 2003 .

[72]  Stephen Muggleton,et al.  Abductive Stochastic Logic Programs for Metabolic Network Inhibition Learning , 2007, MLG.

[73]  Fabrizio Riguzzi,et al.  Abductive concept learning , 2000, New Generation Computing.

[74]  Thomas Stützle,et al.  Stochastic Local Search: Foundations & Applications , 2004 .

[75]  Taisuke Sato,et al.  PRISM: A Language for Symbolic-Statistical Modeling , 1997, IJCAI.