Improving the Accuracy and Efficiency of MAP Inference for Markov Logic

In this work we present Cutting Plane Inference (CPI), a Maximum A Posteriori (MAP) inference method for Statistical Relational Learning. Framed in terms of Markov Logic and inspired by the Cutting Plane Method, it can be seen as a meta-algorithm that instantiates small parts of a large and complex Markov Network and then solves these with a conventional MAP method. We evaluate CPI on two tasks, Semantic Role Labelling and Joint Entity Resolution, plugging in two different base MAP inference methods: MaxWalkSAT, the current method of choice for MAP inference in Markov Logic, and Integer Linear Programming. We observe that both methods are significantly faster when used with CPI than when used alone. In addition, CPI improves the accuracy of MaxWalkSAT and maintains the exactness of Integer Linear Programming.
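
The meta-algorithm idea can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work; it rests on several simplifying assumptions: ground formulas are given as an explicit list of positively weighted clauses, every atom defaults to False, and a brute-force enumerator stands in for the base MAP solver (the role played by MaxWalkSAT or Integer Linear Programming). All names (`satisfied`, `score`, `brute_force_map`, `cutting_plane_inference`) are hypothetical.

```python
from itertools import product

# Assumed toy representation: a ground clause is (weight, literals) and a
# literal is (atom_name, wanted_truth_value). All weights are positive.

def satisfied(clause, world):
    """A clause holds if at least one literal matches the world.
    Atoms missing from the world default to False."""
    _, literals = clause
    return any(world.get(atom, False) == wanted for atom, wanted in literals)

def score(clauses, world):
    """Sum of the weights of all satisfied clauses."""
    return sum(clause[0] for clause in clauses if satisfied(clause, world))

def brute_force_map(clauses, atoms):
    """Stand-in base solver: exact MAP by enumerating all assignments.
    Only feasible because the partial network is kept small."""
    atoms = sorted(atoms)
    best, best_score = {}, float("-inf")
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        current = score(clauses, world)
        if current > best_score:
            best, best_score = world, current
    return best

def cutting_plane_inference(all_clauses, base_solver=brute_force_map):
    """Grow a partial network from the clauses violated by the current
    solution, re-solving it until no violated clause remains."""
    active_ids = set()   # indices of clauses instantiated so far
    world = {}           # initial guess: every atom is False
    while True:
        violated = [i for i, clause in enumerate(all_clauses)
                    if i not in active_ids and not satisfied(clause, world)]
        if not violated:
            return world, active_ids
        active_ids.update(violated)
        active = [all_clauses[i] for i in sorted(active_ids)]
        atoms = {atom for _, literals in active for atom, _ in literals}
        world = base_solver(active, atoms)

clauses = [
    (2.0, [("a", True)]),                 # a should hold
    (1.0, [("a", False), ("b", True)]),   # a implies b
    (1.0, [("b", False), ("c", True)]),   # b implies c
    (0.5, [("d", False)]),                # d should not hold
    (0.5, [("d", False), ("e", False)]),  # d and e should not both hold
]
world, active_ids = cutting_plane_inference(clauses)
print(world, sorted(active_ids))
```

In this toy run the loop only ever instantiates the first three clauses, because the last two are already satisfied by the default assignment, so the base solver never sees atoms `d` and `e`. With positive weights and an exact base solver, the returned assignment scores as well as an exact solution of the full network, since every clause left out is satisfied. The sketch scans an explicit clause list to find violations; a practical system would need a way to find violated ground formulas without enumerating all groundings, which this sketch does not attempt.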
