Extended-HowNet: A Representational Framework for Concepts

Natural languages are means to denote concepts. However word sense ambiguities make natural language processing and conceptual processing almost impossible. To bridge the gaps between natural language representations and conceptual representations, we propose a universal concept representational mechanism, called Extended-HowNet, which was evolved from HowNet. It extends the word sense definition mechanism of HowNet and uses WordNet synsets as vocabulary to describe concepts. Each word sense (or concept) is defined by some simpler concepts. The simple concepts used in the definitions can be further decomposed into even simpler concepts, until primitive or basic concepts are reached. Therefore the definition of a concept can be dynamically decomposed and unified into Extended-HowNet at different levels of representations. Extended-HowNet are language independent. Any word sense of any language can be defined and achieved near-canonical representation. For any two concepts, not only their semantic distances but also their sense similarity and difference are known by checking their definitions. In addition to taxonomy links, concepts are also associated by their shared conceptual features. Fine-grain differences among near-synonyms can be differentiated by adding new features.