Automatic Construction of Domain Terminology Knowledge Base for HowNet Based on the Headword

HowNet is a Chinese-English bilingual common-sense knowledge base that plays an important role in machine translation. For domain-specific machine translation, however, HowNet must be supplemented with domain-specific terminology; in other words, a semantic knowledge base of domain terminology has to be constructed. In this paper, we propose a method that automatically constructs such a knowledge base from the headword of each term. Specifically, the semantic meaning (HowNet DEF) of an unseen term is defined as one of the semantic meanings of the term's headword. The headword is disambiguated by considering its context and by adding domain-specific disambiguation rules to the general disambiguation rules. Experiments in the aviation domain show that the proposed headword disambiguation method achieves a 9.4% improvement over the default disambiguation tools in HowNet. We also find that, with the automatically constructed domain terminology knowledge base, the HowNet machine translation system achieves better translation quality.
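The headword-based idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or from HowNet: the sense labels are placeholders rather than real HowNet DEFs, the headword is taken to be the final token of a term, rules are simple context-cue triggers, domain rules are tried before general ones, and the fallback is the first listed sense.

```python
from typing import Dict, List, NamedTuple, Optional


class Rule(NamedTuple):
    headword: str  # headword the rule applies to
    cue: str       # context word that triggers the rule
    sense: str     # sense label chosen when the rule fires


# Toy sense inventory: placeholder labels, NOT real HowNet DEFs.
SENSES: Dict[str, List[str]] = {
    "gear": ["SENSE_clothing", "SENSE_machine_part"],
    "wing": ["SENSE_bird_limb", "SENSE_aircraft_part"],
}

# Hypothetical rule sets for illustration only.
GENERAL_RULES = [Rule("wing", "bird", "SENSE_bird_limb")]
DOMAIN_RULES = [
    Rule("gear", "landing", "SENSE_machine_part"),
    Rule("wing", "aircraft", "SENSE_aircraft_part"),
]


def headword_of(term: List[str]) -> str:
    # Assumption: the headword is the final token of the term.
    return term[-1]


def disambiguate(headword: str, context: set, rules: List[Rule]) -> Optional[str]:
    """Pick one sense of the headword using the first rule that fires."""
    candidates = SENSES.get(headword, [])
    if not candidates:
        return None  # headword unknown to the sense inventory
    for rule in rules:
        if (rule.headword == headword
                and rule.cue in context
                and rule.sense in candidates):
            return rule.sense
    return candidates[0]  # fallback: first listed sense


def define_term(term: List[str], sentence: List[str], rules: List[Rule]) -> Optional[str]:
    """Assign an unseen term the disambiguated sense of its headword."""
    head = headword_of(term)
    # Context = modifiers inside the term plus surrounding sentence tokens.
    context = (set(term[:-1]) | set(sentence)) - {head}
    return disambiguate(head, context, rules)


if __name__ == "__main__":
    term = ["landing", "gear"]
    sentence = ["the", "landing", "gear", "retracted"]
    print(define_term(term, sentence, GENERAL_RULES))
    print(define_term(term, sentence, DOMAIN_RULES + GENERAL_RULES))
```

In this toy setup, the general rules alone fall back to the first listed sense of "gear", while placing the domain rules ahead of the general ones selects the machine-part sense. This mirrors, in a very simplified form, the effect of supplementing general rules with domain-specific ones.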