Bilingual Text Matching using Bilingual Dictionary and Statistics

This paper describes a unified framework for bilingual text matching that combines existing hand-written bilingual dictionaries with statistical techniques. The process of bilingual text matching consists of two major steps: sentence alignment and structural matching of bilingual sentences. Statistical techniques are applied to estimate word correspondences that are not included in the bilingual dictionaries. The estimated word correspondences are useful for improving both sentence alignment and structural matching.
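As a rough illustration of estimating word correspondences that a dictionary lacks, the sketch below scores co-occurring word pairs over already aligned sentence pairs. The abstract does not name the statistic used, so the Dice coefficient here is an assumption, and the function name `dice_correspondences`, its parameters, and the `min_score` threshold are hypothetical.

```python
from collections import Counter
from itertools import product


def dice_correspondences(pairs, dictionary, min_score=0.5):
    """Score candidate word correspondences missing from the dictionary.

    pairs:      list of (source_tokens, target_tokens) for aligned sentences
    dictionary: set of (source_word, target_word) entries already known
    min_score:  hypothetical cut-off on the Dice coefficient (an assumption)
    """
    src_freq, tgt_freq, co_freq = Counter(), Counter(), Counter()
    for src_tokens, tgt_tokens in pairs:
        # Count each word once per sentence pair, not once per occurrence.
        src_set, tgt_set = set(src_tokens), set(tgt_tokens)
        src_freq.update(src_set)
        tgt_freq.update(tgt_set)
        co_freq.update(product(src_set, tgt_set))

    scores = {}
    for (s, t), c in co_freq.items():
        if (s, t) in dictionary:
            continue  # already covered by the hand-written dictionary
        # Dice coefficient: 2 * co-occurrences / (freq(s) + freq(t))
        score = 2 * c / (src_freq[s] + tgt_freq[t])
        if score >= min_score:
            scores[(s, t)] = score
    return scores
```

In a combined setup of the kind the abstract outlines, pairs scored this way could supplement dictionary entries when sentence alignment or structural matching is re-run; how the two sources are actually weighted is not specified here.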
