Statistical Machine Translation with Scarce Resources Using Morpho-syntactic Information

In statistical machine translation, correspondences between the words in the source and the target language are learned from parallel corpora, and often little or no linguistic knowledge is used to structure the underlying models. In particular, existing statistical systems for machine translation often treat different inflected forms of the same lemma as if they were independent of one another. The bilingual training data can be better exploited by explicitly taking into account the interdependencies of related inflected forms. We propose the construction of hierarchical lexicon models on the basis of equivalence classes of words. In addition, we introduce sentence-level restructuring transformations which aim at the assimilation of word order in related sentences. We have systematically investigated the amount of bilingual training data required to maintain an acceptable quality of machine translation. The combination of the suggested methods for improving translation quality in frameworks with scarce resources has been successfully tested: We were able to reduce the amount of bilingual training data to less than 10% of the original corpus, while losing only 1.6% in translation quality. The improvement of the translation results is demonstrated on two German-English corpora taken from the Verbmobil task and the Nespole! task.
