Training Statistical Machine Translation with Multivariate Mutual Information

In this paper, we describe a new model for phrase-based statistical machine translation. Roughly speaking, the statistical approach relies on a language model and a translation model; the latter can be viewed as the combination of a lexical model and an alignment model. The approach we propose does not require any alignment: it is based on inter-lingual triggers determined by multivariate mutual information (MMI). This measure depends on conditional mutual information, which means that a source phrase is linked directly to a target phrase. The conditional mutual information is used in both directions (source-to-target and target-to-source). We present an experimental evaluation conducted on the EUROPARL corpora (French and English) using the MOSES decoder. We then compare our results with those of our previous work, in which inter-lingual triggers were determined by simple mutual information (MI), and with those of the baseline model (Koehn et al., 2003).
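
For orientation, the relationship between MMI and conditional mutual information mentioned above can be illustrated with one common textbook definition for three random variables. This is only a sketch under assumptions: the symbols X, Y, Z and the sign convention are chosen here for illustration, and the exact formulation used for the inter-lingual triggers may differ.

```latex
% Assumed notation (not taken from the abstract): X, Y, Z are discrete
% random variables, e.g. words or phrases drawn from the two languages.

% Simple (pairwise) mutual information:
I(X;Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}

% Conditional mutual information of X and Y given Z:
I(X;Y \mid Z) = \sum_{x,y,z} p(x,y,z) \log \frac{p(z)\,p(x,y,z)}{p(x,z)\,p(y,z)}

% One common definition of multivariate mutual information
% (sign conventions vary across the literature):
I(X;Y;Z) = I(X;Y) - I(X;Y \mid Z)
```

Under this definition, the multivariate measure is the pairwise mutual information corrected by its conditional counterpart, which is why computing MMI requires the conditional term.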