Improved HMM Alignment Models for Languages with Scarce Resources

We introduce improvements to statistical word alignment based on the Hidden Markov Model. One improvement incorporates syntactic knowledge. Results on the workshop data show that alignment performance exceeds that of a state-of-the-art system based on more complex models, yielding an absolute error reduction of over 5.5% on Romanian-English.
