A Chinese-English patent machine translation system based on the theory of hierarchical network of concepts

Abstract: Compared with ordinary text, patent text often has more complex sentence structures and more ambiguity arising from multiple verbs. To deal with these problems, this paper presents a rule-based Chinese-English patent machine translation (MT) system based on the theory of the hierarchical network of concepts (HNC). In this system, the whole procedure is divided into three main parts: semantic analysis of the source language, transformation from the source language to the target language, and generation of the target language. The knowledge base and the rule set are obtained by manually analyzing the semantic features of a training set containing more than 6,000 Chinese patent sentences, and a specific evaluation method is provided for the experiment.
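The three-part procedure described in the abstract (analysis, transformation, generation) can be pictured with a minimal Python sketch. This is only an illustration of a generic rule-based transfer pipeline, not the system's actual implementation: the `Chunk` type, the `VERBS` and `LEXICON` tables, the function names, and the single head-verb rule are all hypothetical, and the input is assumed to be already word-segmented.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    role: str  # hypothetical role label: "subject", "verb", or "object"
    text: str  # text of the chunk (source or target language)


# Hypothetical toy knowledge base; a real one would be far larger.
VERBS = {"提供"}
LEXICON = {
    "本发明": "the present invention",
    "提供": "provides",
    "一种装置": "a device",
}


def analyze(tokens):
    """Stage 1 (analysis): pick the first known verb as the head verb
    and split the sentence into subject / verb / object chunks."""
    head = next(i for i, tok in enumerate(tokens) if tok in VERBS)
    return [
        Chunk("subject", "".join(tokens[:head])),
        Chunk("verb", tokens[head]),
        Chunk("object", "".join(tokens[head + 1:])),
    ]


def transfer(chunks):
    """Stage 2 (transformation): map each chunk to the target language
    with a lexicon lookup, leaving unknown chunks unchanged."""
    return [Chunk(c.role, LEXICON.get(c.text, c.text)) for c in chunks]


def generate(chunks):
    """Stage 3 (generation): linearize the chunks in subject-verb-object
    order and apply sentence-level formatting."""
    by_role = {c.role: c.text for c in chunks}
    sentence = " ".join(by_role[r] for r in ("subject", "verb", "object"))
    return sentence[0].upper() + sentence[1:] + "."


def translate(tokens):
    """Run the three stages in sequence."""
    return generate(transfer(analyze(tokens)))


print(translate(["本发明", "提供", "一种装置"]))
```

Keeping the stages as separate functions mirrors the division named in the abstract: each stage can be given its own rules without touching the others. Resolving ambiguity among multiple verbs, which the abstract identifies as a central difficulty, would need much richer rules in the analysis stage than the single lookup assumed here.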
