Using Semantic Structure to Improve Chinese-English Term Translation

This paper introduces a method for translating Chinese terms into English. Our motivation is to provide deep semantic-level information for term translation by analyzing the semantic structure of terms. Using the contextual information within the term and the first sememe of each word in HowNet as features, we train a Support Vector Machine (SVM) model to identify the dependencies among the words in a term. A Conditional Random Field (CRF) model is then trained to label the semantic relation of each dependency. During translation, the semantic relations within a Chinese term are identified, and three features based on its semantic structure are integrated into a phrase-based statistical machine translation system. Experimental results show that the proposed method improves on the baseline system by 1.58 BLEU points.
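The feature extraction step for the dependency classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the sememe dictionary `TOY_SEMEMES` is a hypothetical stand-in for a real HowNet lookup, and the feature names, the one-word context window, and the distance feature are assumptions chosen for illustration. The sketch only shows how each candidate word pair in a segmented term could be described by contextual features plus the first sememe of each word before being passed to a classifier such as an SVM.

```python
from itertools import combinations

# Hypothetical stand-in for HowNet: word -> list of sememes, first entry is the
# primary (first) sememe. Entries are illustrative, not taken from HowNet.
TOY_SEMEMES = {
    "机器": ["machine", "tool"],
    "翻译": ["translate", "language"],
    "系统": ["system"],
}


def first_sememe(word, lexicon=TOY_SEMEMES):
    """Return the first sememe of a word, or "UNK" if the word is not listed."""
    sememes = lexicon.get(word)
    return sememes[0] if sememes else "UNK"


def pair_features(words, i, j, lexicon=TOY_SEMEMES):
    """Build features for a candidate dependency between words[i] and words[j] (i < j)."""

    def at(k):
        # Pad positions that fall outside the term boundaries.
        return words[k] if 0 <= k < len(words) else "<PAD>"

    return {
        "w_i": words[i],
        "w_j": words[j],
        "sem_i": first_sememe(words[i], lexicon),
        "sem_j": first_sememe(words[j], lexicon),
        "prev_i": at(i - 1),   # context word to the left of the first word
        "next_j": at(j + 1),   # context word to the right of the second word
        "distance": j - i,     # assumed extra feature: gap between the two words
    }


def candidate_pairs(words, lexicon=TOY_SEMEMES):
    """Map every ordered index pair (i, j), i < j, to its feature dictionary."""
    return {
        (i, j): pair_features(words, i, j, lexicon)
        for i, j in combinations(range(len(words)), 2)
    }


term = ["机器", "翻译", "系统"]  # a segmented term, "machine translation system"
features = candidate_pairs(term)
```

Each feature dictionary could then be vectorized and given to a binary classifier that decides whether the pair forms a dependency; the identified dependencies would in turn be the input to the relation labeling step.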
