On Amortizing Inference Cost for Structured Prediction

This paper deals with the problem of predicting structures in the context of NLP. Typically, in structured prediction, an inference procedure is applied to each example independently of the others. In this paper, we seek to optimize the time complexity of inference over entire datasets, rather than individual examples. By considering the general inference representation provided by integer linear programs, we propose three exact inference theorems which allow us to re-use earlier solutions for certain instances, thereby completely avoiding possibly expensive calls to the inference procedure. We also identify several approximation schemes which can provide further speedup. We instantiate these ideas to the structured prediction task of semantic role labeling and show that we can achieve a speedup of over 2.5 using our approach while retaining the guarantees of exactness and a further speedup of over 3 using approximations that do not degrade performance.

[1]  Thomas Hofmann,et al.  Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..

[2]  L. Getoor,et al.  1 Global Inference for Entity and Relation Identification via a Linear Programming Formulation , 2007 .

[3]  Alexander M. Rush,et al.  On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing , 2010, EMNLP.

[4]  Dan Roth,et al.  A Linear Programming Formulation for Global Inference in Natural Language Tasks , 2004, CoNLL.

[5]  Fernando Pereira,et al.  Non-Projective Dependency Parsing using Spanning Tree Algorithms , 2005, HLT.

[6]  D. Roth 1 Global Inference for Entity and Relation Identification via a Linear Programming Formulation , 2007 .

[7]  Cícero Nogueira dos Santos,et al.  Semantic Role Labeling , 2012 .

[8]  Dan Roth,et al.  The Importance of Syntactic Parsing and Inference in Semantic Role Labeling , 2008, CL.

[9]  Michael Collins,et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.

[10]  Dan Roth,et al.  Lifted First-Order Probabilistic Inference , 2005, IJCAI.

[11]  Mirella Lapata,et al.  Constraint-Based Sentence Compression: An Integer Programming Approach , 2006, ACL.

[12]  Sebastian Riedel Cutting Plane MAP Inference for Markov Logic , 2009 .

[13]  Ming-Wei Chang,et al.  Guiding Semi-Supervision with Constraint-Driven Learning , 2007, ACL.

[14]  Sebastian Riedel,et al.  Incremental Integer Linear Programming for Non-projective Dependency Parsing , 2006, EMNLP.

[15]  Eric P. Xing,et al.  Concise Integer Linear Programming Formulations for Dependency Parsing , 2009, ACL.

[16]  Michael Collins,et al.  Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation , 2011, EMNLP.

[17]  Dan Roth,et al.  The Necessity of Syntactic Parsing for Semantic Role Labeling , 2005, IJCAI.