Word Alignment via Submodular Maximization over Matroids

We cast word alignment as maximizing a submodular function subject to matroid constraints. The framework can express complex interactions between alignment components while remaining computationally efficient, thanks to the power and generality of submodular functions. We show that submodularity arises naturally when modeling word fertility. Experiments on the English-French Hansards alignment task show that our approach achieves lower alignment error rates than conventional matching-based approaches.
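
To make the setup concrete, here is a minimal sketch of greedy submodular maximization under a partition matroid. It is an illustration under stated assumptions, not the objective or algorithm used in the paper: the function name `greedy_align`, the square-root objective, the `scores` dictionary, and the `max_fertility` parameter are all hypothetical. The square root of summed link scores per source word gives diminishing returns as a source word collects more links, which is one simple way fertility can induce submodularity. The cap on links per target word is a partition matroid.

```python
import math
from collections import defaultdict


def greedy_align(scores, max_fertility=1):
    """Greedy maximization of a monotone submodular alignment objective.

    scores: dict mapping (source_index, target_index) -> nonnegative link score
            (hypothetical input, e.g. from a translation table).
    Constraint (assumed): each target word takes at most `max_fertility` links,
            which is a partition matroid over the candidate links.
    Objective (assumed): f(A) = sum over source words i of
            sqrt(sum of scores of links in A that touch i).
    """

    def objective(links):
        per_source = defaultdict(float)
        for (i, j) in links:
            per_source[i] += scores[(i, j)]
        return sum(math.sqrt(total) for total in per_source.values())

    chosen = set()
    links_per_target = defaultdict(int)
    current = 0.0

    while True:
        best_gain, best_link = 0.0, None
        # Sorted iteration makes tie-breaking deterministic.
        for link in sorted(scores):
            if link in chosen or links_per_target[link[1]] >= max_fertility:
                continue  # already picked, or would violate the matroid constraint
            gain = objective(chosen | {link}) - current
            if gain > best_gain + 1e-12:
                best_gain, best_link = gain, link
        if best_link is None:
            break  # no feasible link improves the objective
        chosen.add(best_link)
        links_per_target[best_link[1]] += 1
        current += best_gain

    return chosen, current


# Toy example: source word 0 scores well with both target words,
# source word 1 scores weakly with target word 1.
links, value = greedy_align({(0, 0): 4.0, (0, 1): 4.0, (1, 1): 1.0})
print(sorted(links), value)  # [(0, 0), (1, 1)] 3.0
```

In the toy example, once source word 0 has one link, a second link to it adds only sqrt(8) - 2 ≈ 0.83, so the greedy step prefers the weaker link for source word 1, which adds 1.0. For monotone submodular objectives under a single matroid constraint, this kind of greedy selection is known to reach at least half of the optimal value.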
