Paper Information - Improved Statistical Alignment Models

Improved Statistical Alignment Models

In this paper, we present and compare various single-word-based alignment models for statistical machine translation. We discuss the five IBM alignment models, the Hidden-Markov alignment model, smoothing techniques, and various modifications. We present different methods to combine alignments. As the evaluation criterion, we use the quality of the resulting Viterbi alignment compared to a manually produced reference alignment. We show that models with a first-order dependence and a fertility model lead to significantly better results than the simple models IBM-1 or IBM-2, which are not able to go beyond zero-order dependencies.
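To make the "zero-order" limitation of IBM-1 concrete, the following is a minimal sketch of textbook IBM-1 training and Viterbi alignment, not the authors' implementation. The function names `train_ibm1` and `viterbi_alignment`, the `NULL` token convention, and the three-sentence toy corpus are all assumptions introduced here for illustration. Because IBM-1 scores each source word independently of its neighbours and of its position, the Viterbi alignment reduces to a per-word argmax over lexicon probabilities.

```python
from collections import defaultdict

NULL = "NULL"  # hypothetical empty-word token; lets a source word stay unaligned


def train_ibm1(bitext, iterations=10):
    """EM estimation of IBM-1 lexicon probabilities t(f | e).

    bitext: list of (source_words, target_words) sentence pairs.
    Returns a dict mapping (f, e) to a probability.
    """
    pairs = [(fs, [NULL] + es) for fs, es in bitext]
    src_vocab = {f for fs, _ in pairs for f in fs}
    # Uniform start over all co-occurring word pairs.
    t = {(f, e): 1.0 / len(src_vocab)
         for fs, es in pairs for f in fs for e in es}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: collect expected alignment counts.
        for fs, es in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalise so that sum_f t(f | e) = 1 for every e.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


def viterbi_alignment(fs, es, t):
    """Most probable target position for each source word (0 means NULL).

    IBM-1 has no dependence on position or on neighbouring alignments,
    so each source word is decided independently.
    """
    es = [NULL] + es
    return [max(range(len(es)), key=lambda i: t.get((f, es[i]), 0.0))
            for f in fs]


# Invented toy corpus, used only to exercise the sketch.
toy_bitext = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_ibm1(toy_bitext)
alignment = viterbi_alignment(["das", "Haus"], ["the", "house"], t)
```

A first-order model such as the HMM alignment model would replace the per-word argmax with a dynamic-programming search, because each alignment position then depends on the previous one.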

Hermann Ney | Franz Josef Och
