Online Structure Learning for Markov Logic Networks

Most existing learning methods for Markov Logic Networks (MLNs) use batch training, which becomes computationally expensive and eventually infeasible for large datasets with thousands of training examples that may not all fit in main memory. To address this issue, previous work has used online learning to train MLNs. However, these methods all assume that the model's structure (its set of logical clauses) is given and learn only the model's parameters. Because the input structure is usually incomplete, it should also be updated. In this work, we present OSL, the first algorithm that performs both online structure and parameter learning for MLNs. Experimental results on two real-world datasets for natural-language field segmentation show that OSL outperforms systems that cannot revise structure.
