Paper information - Improving Statistical Natural Language Translation with Categories and Rules

This paper describes an all-level approach to statistical natural language translation (SNLT). Without any predefined knowledge, the system learns a statistical translation lexicon (STL), word classes (WCs), and translation rules (TRs) from a parallel corpus, thereby producing a generalized form of a word alignment (WA). The translation process itself is realized as a beam search. In our method, example-based techniques are integrated into an overall statistical approach, yielding about 50 percent correctly translated sentences on the very difficult English-German VERBMOBIL spontaneous speech corpus.

Franz Josef Och | Hans Weber
