On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing

This paper introduces dual decomposition as a framework for deriving inference algorithms for NLP problems. The approach relies on standard dynamic-programming algorithms as oracle solvers for sub-problems, together with a simple method for forcing agreement between the different oracles. The approach provably solves a linear programming (LP) relaxation of the global inference problem. It leads to algorithms that are simple, in that they use existing decoding algorithms; efficient, that they avoid exact algorithms for the full model; and often exact, in that empirically they often recover the correct solution in spite of using an LP relaxation. We give experimental results on two problems: 1) the combination of two lexicalized parsing models; and 2) the combination of a lexicalized parsing model and a trigram part-of-speech tagger.

[1]  George B. Dantzig,et al.  Decomposition Principle for Linear Programs , 1960 .

[2]  Ronald L. Rardin,et al.  Polyhedral Characterization of Discrete Dynamic Programming , 1990, Oper. Res..

[3]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[4]  Hanif D. Sherali,et al.  A Hierarchy of Relaxations and Convex Hull Characterizations for Mixed-integer Zero-one Programming Problems , 1994, Discret. Appl. Math..

[5]  Christopher D. Manning,et al.  Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger , 2000, EMNLP.

[6]  Jason Eisner,et al.  Bilexical Grammars and their Cubic-Time Parsing Algorithms , 2000 .

[7]  Michael Collins,et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.

[8]  Michael Collins,et al.  Head-Driven Statistical Models for Natural Language Parsing , 2003, CL.

[9]  Bo Pang,et al.  A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts , 2004, ACL.

[10]  Ben Taskar,et al.  Max-Margin Parsing , 2004, EMNLP.

[11]  Dan Roth,et al.  Integer linear programming inference for conditional random fields , 2005, ICML.

[12]  Fernando Pereira,et al.  Non-Projective Dependency Parsing using Spanning Tree Algorithms , 2005, HLT.

[13]  Martin J. Wainwright,et al.  MAP estimation via agreement on trees: message-passing and linear programming , 2005, IEEE Transactions on Information Theory.

[14]  Martin J. Wainwright,et al.  A new class of upper bounds on the log partition function , 2002, IEEE Transactions on Information Theory.

[15]  Sebastian Riedel,et al.  Incremental Integer Linear Programming for Non-projective Dependency Parsing , 2006, EMNLP.

[16]  Daniel Tarlow,et al.  Using Combinatorial Optimization within Max-Product Belief Propagation , 2006, NIPS.

[17]  Yair Weiss,et al.  Linear Programming Relaxations and Belief Propagation - An Empirical Study , 2006, J. Mach. Learn. Res..

[18]  Nikos Komodakis,et al.  MRF Optimization via Dual Decomposition: Message-Passing Revisited , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[19]  Tommi S. Jaakkola,et al.  Fixing Max-Product: Convergent Message Passing Algorithms for MAP LP-Relaxations , 2007, NIPS.

[20]  Jens Vygen,et al.  The Book Review Column1 , 2020, SIGACT News.

[21]  Xavier Carreras,et al.  Experiments with a Higher-Order Projective Dependency Parser , 2007, EMNLP.

[22]  Xavier Carreras,et al.  TAG, Dynamic Programming, and the Perceptron for Efficient, Feature-Rich Parsing , 2008, CoNLL.

[23]  Tommi S. Jaakkola,et al.  Tightening LP Relaxations for MAP using Message Passing , 2008, UAI.

[24]  Xavier Carreras,et al.  Simple Semi-supervised Dependency Parsing , 2008, ACL.

[25]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..

[26]  David A. Smith,et al.  Dependency Parsing by Belief Propagation , 2008, EMNLP.

[27]  Eric P. Xing,et al.  Concise Integer Linear Programming Formulations for Dependency Parsing , 2009, ACL.

[28]  Asuman E. Ozdaglar,et al.  Approximate Primal Solutions and Rate Analysis for Dual Subgradient Methods , 2008, SIAM J. Optim..

[29]  Alexander M. Rush,et al.  Dual Decomposition for Parsing with Non-Projective Head Automata , 2010, EMNLP.