Discriminative Models for Automatic Acquisition of Translation Equivalences

Translation equivalences are important for bilingual lexicography, machine translation systems, and cross-lingual information retrieval. Extracting equivalences from bilingual sentence pairs is a data mining problem. In this paper, discriminative learning methods are employed to filter translation equivalences. Discriminative features, including translation literality, phrase alignment probability, and phrase length ratio, are used to evaluate candidate equivalences. A random sample of 1000 equivalences is filtered and then evaluated. Experimental results indicate that the support vector machine achieves a precision of 87.8% and a recall of 89.8%.
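The filtering setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` fields, the whitespace tokenization, the length-ratio definition, and the hand-set `WEIGHTS` and `BIAS` (a stand-in for a trained linear classifier such as an SVM) are all assumptions made for illustration. Only the three feature names and the precision/recall evaluation come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate translation equivalence (hypothetical structure)."""
    source: str
    target: str
    literality: float   # assumed precomputed translation literality score in [0, 1]
    align_prob: float   # assumed precomputed phrase alignment probability in [0, 1]


def length_ratio(c: Candidate) -> float:
    """Ratio of the shorter to the longer phrase, in whitespace tokens (assumed definition)."""
    s, t = len(c.source.split()), len(c.target.split())
    return min(s, t) / max(s, t) if max(s, t) else 0.0


def features(c: Candidate) -> list[float]:
    """Feature vector: literality, alignment probability, length ratio."""
    return [c.literality, c.align_prob, length_ratio(c)]


# Hypothetical parameters standing in for a trained linear model.
WEIGHTS = [1.0, 1.0, 1.0]
BIAS = -1.5


def keep(c: Candidate) -> bool:
    """Accept the candidate if the linear decision score is positive."""
    score = sum(w * x for w, x in zip(WEIGHTS, features(c))) + BIAS
    return score > 0


def precision_recall(predicted: list[bool], gold: list[bool]) -> tuple[float, float]:
    """Precision and recall of the accepted candidates against gold labels."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In practice the weights would be learned from labeled equivalences rather than set by hand, and the decisions from `keep` would be compared against human judgments with `precision_recall`.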
