Paper information - Ontology Construction and Its Application to Disambiguate Word Senses

Ontology Construction and Its Application to Disambiguate Word Senses

This paper presents an ontology construction method that uses various computational language resources, together with an ontology-based word sense disambiguation method. To acquire a reasonably practical ontology, the Kadokawa thesaurus is extended by inserting additional semantic relations, classified as case relations and other semantic relations, into its hierarchy. To apply the ontology to word sense disambiguation, we first use previously secured dictionary information to select the correct senses of some ambiguous words with high precision, and then use the ontology to disambiguate the remaining ambiguous words. The mutual information between concepts in the ontology is calculated before the ontology is used as disambiguation knowledge. If mutual information is regarded as a weight between ontology concepts, the ontology can be treated as a graph with weighted edges, in which we locate the weighted path from one concept to another. In our practical machine translation system, our word sense disambiguation method achieved a 9% improvement over methods that do not use an ontology for Korean translation.
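The idea of treating mutual information as edge weights and searching for a path between concepts can be illustrated with a minimal sketch. The abstract does not give the exact formula or search procedure, so everything here is an assumption made for illustration: pointwise mutual information stands in for the mutual-information measure, the reciprocal of the score is used as an edge cost so that strongly associated concepts are "close", Dijkstra's algorithm is used as the path search, and the concept names, counts, and function names (`pmi`, `build_graph`, `cheapest_path`, `pick_sense`) are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def pmi(pair_count, count_a, count_b, total):
    """Pointwise mutual information: log2(P(a, b) / (P(a) * P(b)))."""
    return math.log2((pair_count * total) / (count_a * count_b))


def build_graph(pair_counts, concept_counts, total):
    """Build an undirected graph whose edge cost shrinks as PMI grows.

    The reciprocal mapping from PMI to cost is an assumption of this sketch.
    """
    graph = defaultdict(dict)
    for (a, b), n in pair_counts.items():
        score = pmi(n, concept_counts[a], concept_counts[b], total)
        if score <= 0:
            continue  # drop pairs that are not positively associated
        cost = 1.0 / score
        graph[a][b] = cost
        graph[b][a] = cost
    return graph


def cheapest_path(graph, source, target):
    """Dijkstra's algorithm; returns (cost, path), or (inf, []) if unreachable."""
    queue = [(0.0, source, [source])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return math.inf, []


def pick_sense(graph, candidate_senses, context_concept):
    """Choose the candidate sense with the cheapest path to the context concept."""
    return min(
        candidate_senses,
        key=lambda sense: cheapest_path(graph, sense, context_concept)[0],
    )


# Toy co-occurrence counts (made up for illustration only).
concept_counts = {"finance": 100, "money": 100, "shore": 100, "river": 100}
pair_counts = {
    ("finance", "money"): 40,
    ("shore", "river"): 40,
    ("finance", "river"): 5,
}
graph = build_graph(pair_counts, concept_counts, total=1000)

# An ambiguous word with two candidate concepts, disambiguated by context.
sense_near_money = pick_sense(graph, ["finance", "shore"], "money")
sense_near_river = pick_sense(graph, ["finance", "shore"], "river")
```

With these toy counts, the candidate concept that is connected to the context concept by the cheaper path is selected, which mirrors the graph-based selection step in spirit only.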

Sin-Jae Kang

[1] Ramanathan V. Guha, et al. Cyc: toward programs with common sense, 1990, CACM.

[2] Gary Geunbae Lee, et al. Lexical Transfer Ambiguity Resolution Using Automatically-Extracted Concept Co-occurrence Information, 2000, Int. J. Comput. Process. Orient. Lang.

[3] David Yarowsky, et al. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods, 1995, ACL.

[4] David Yarowsky, et al. Word-Sense Disambiguation Using Statistical Models of Roget’s Categories Trained on Large Corpora, 1992, COLING.

[5] Jong-Hyeok Lee, et al. Representation and Recognition Method for Multi-Word Translation Units in Korean-to-Japanese MT System, 2000, COLING.

[6] George A. Miller, et al. Introduction to WordNet: An On-line Lexical Database, 1990.