Improving Translation of Organization Names Combining Translation Model and Web Mining

Named entity (NE) translation is a fundamental task in machine translation (MT) and cross-language information retrieval (CLIR), and organization names (ONs) are the most complex NEs to translate. We propose a novel system for translating ONs from Chinese to English that combines a translation model with web resources. First, we build a chunk-based translation model. Second, we expand the query using the translation model and term-subject co-occurrence. Third, we retrieve sentences that contain the Chinese organization name together with English text, and use frequency shifting and adjacency information to extract English fragments as translation candidates. Finally, we select the best translation by computing the trustworthiness of each candidate. Experimental results show that the approach outperforms machine translation-based systems.
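The candidate-mining step can be illustrated with a minimal sketch. It assumes mixed Chinese-English web snippets are already available, collects maximal runs of English words as candidate fragments, and ranks them by frequency, boosted by overlap with words from a rough model translation. The function name `rank_candidates`, the `hint_words` parameter, and the scoring formula are hypothetical; they stand in for, and do not reproduce, the frequency shifting, adjacency information, and trustworthiness measures named above.

```python
import re
from collections import Counter

# A run of ASCII words separated by single spaces, e.g. "Peking University".
ENGLISH_RUN = re.compile(r"[A-Za-z]+(?: [A-Za-z]+)*")


def rank_candidates(snippets, hint_words):
    """Rank English fragments found in mixed-language snippets.

    snippets:   strings that contain the Chinese name plus some English text.
    hint_words: English words from a rough model translation (assumed input).
    The score is illustrative only, not the paper's trustworthiness measure.
    """
    counts = Counter()
    for snippet in snippets:
        for fragment in ENGLISH_RUN.findall(snippet):
            counts[fragment] += 1

    hints = {w.lower() for w in hint_words}

    def score(fragment):
        words = fragment.lower().split()
        # Fraction of the fragment's words that also appear in the hints.
        overlap = sum(w in hints for w in words) / len(words)
        return counts[fragment] * (1.0 + overlap)

    return sorted(counts, key=score, reverse=True)


if __name__ == "__main__":
    snippets = [
        "北京大学 (Peking University) 成立于1898年",
        "北京大学, Peking University, 简称北大",
        "北京大学 PKU 官网",
    ]
    print(rank_candidates(snippets, ["Beijing", "University"]))
```

Here the hint words come from a literal model translation ("Beijing University"), so a fragment that shares a word with it and recurs across snippets ranks above a one-off abbreviation. A real system would need a more careful fragment boundary detector than a single regular expression.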
