Hierarchical Search for Word Alignment

We present a simple yet powerful hierarchical search algorithm for automatic word alignment. Our algorithm induces a forest of alignments from which we can efficiently extract a ranked k-best list. We score each alignment in the forest with a flexible linear discriminative model that incorporates hundreds of features and is trained on a relatively small amount of annotated data. We report results on Arabic-English word alignment and translation tasks. Our model outperforms a GIZA++ Model-4 baseline by 6.3 points in F-measure, and the resulting alignments yield a 1.1-point BLEU increase over a state-of-the-art syntax-based machine translation system.
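As a rough illustration of the two quantities the abstract relies on, the sketch below scores candidate alignments with a linear model and evaluates one against a gold alignment with a balanced F-measure. It is not the paper's method: it ranks an explicit list of candidates rather than searching a forest, and the two features (`link_count`, `diagonal_distance`), the weights, and all function names are hypothetical stand-ins for the hundreds of features the model uses. The F-measure is assumed to be the balanced harmonic mean of precision and recall over alignment links.

```python
import heapq
from typing import Dict, FrozenSet, List, Tuple

Link = Tuple[int, int]          # (source word index, target word index)
Alignment = FrozenSet[Link]


def features(alignment: Alignment, src_len: int, tgt_len: int) -> Dict[str, float]:
    """Two toy features (hypothetical, for illustration only)."""
    # Total distance of the links from the diagonal of the alignment matrix.
    dist = sum(abs(i / src_len - j / tgt_len) for i, j in alignment)
    return {"link_count": float(len(alignment)), "diagonal_distance": dist}


def linear_score(feats: Dict[str, float], weights: Dict[str, float]) -> float:
    """Dot product of the weight vector and the feature vector."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def k_best(candidates: List[Alignment], weights: Dict[str, float],
           src_len: int, tgt_len: int, k: int) -> List[Alignment]:
    """Rank an explicit candidate list; a forest would avoid enumerating it."""
    return heapq.nlargest(
        k, candidates,
        key=lambda a: linear_score(features(a, src_len, tgt_len), weights),
    )


def f_measure(predicted: Alignment, gold: Alignment) -> float:
    """Balanced F-measure over alignment links (assumed definition)."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For a two-word sentence pair, one might call `k_best(candidates, {"link_count": 1.0, "diagonal_distance": -2.0}, 2, 2, k=2)` and then compare the top candidate with a hand-annotated alignment using `f_measure`.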
