Ontology and the Lexicon: Formal ontology as interlingua: the SUMO and WordNet linking project and global WordNet

WordNet is a large lexical database for English. With its broad coverage and a design that is useful for a range of natural-language processing applications, this resource has found wide acceptance. We offer only a brief description here and refer the reader to Miller (1990) and Fellbaum (1998) for further details. WordNet's creation in the mid-1980s was motivated by then-current theories of human semantic organization (Collins and Quillian, 1969). People have knowledge about tens of thousands of concepts, and the words expressing these concepts must be stored and retrieved in an efficient and economical fashion. A semantic network such as WordNet is an attempt to model one way in which concepts and words could be organized.

The basic unit of WordNet is a set of cognitively equivalent synonyms, or synset. Examples of a noun, verb, and adjective synset are { vacation, holiday }, { close, shut }, and { soiled, dirty }, respectively. Each synset represents a concept, and each member of a synset encodes the same concept. Put differently, synset members are interchangeable in many contexts without changing the truth value of the context. Each synset also includes a definition, or ‘gloss’, and an illustrative sentence. The current version of WordNet (3.0) contains over 117,000 synsets that are organized into a large semantic network. The synsets are interlinked by means of bidirectional semantic relations such as hyponymy, meronymy, and a number of entailment relations. For example, the relation between oak and tree is such that oak is encoded as a hyponym (subordinate) of tree and tree is encoded as a hypernym (superordinate) of oak. Leaf and trunk are meronyms (parts) of tree, their holonym. Meronyms are inherited along the hyponymy hierarchy, so linking leaf and trunk to tree means that oak (and beech, maple, etc.) inherits leaf and trunk as parts by virtue of its relation to tree (Miller, 1990, 1998).
Concepts expressed by other parts of speech (verbs, adjectives) are interlinked by means of additional relations (Fellbaum, 1998).
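
The organization described above can be illustrated with a small sketch. This is a toy model written for exposition only, not WordNet's actual storage format or any library's API: the class name Synset, the helper functions, and the glosses are all hypothetical. It shows synsets as sets of words, hyponymy and meronymy recorded in both directions, and the way oak picks up leaf and trunk as parts through its hypernym tree.

```python
class Synset:
    """Toy synset: a group of synonymous words plus a gloss (hypothetical model)."""

    def __init__(self, words, gloss=""):
        self.words = tuple(words)
        self.gloss = gloss
        self.hypernyms = []  # superordinate synsets
        self.hyponyms = []   # subordinate synsets
        self.meronyms = []   # parts of this concept
        self.holonyms = []   # wholes this concept is part of


def add_hyponym(parent, child):
    """Record hyponymy in both directions, since the relation is bidirectional."""
    parent.hyponyms.append(child)
    child.hypernyms.append(parent)


def add_meronym(whole, part):
    """Record meronymy in both directions (part <-> whole)."""
    whole.meronyms.append(part)
    part.holonyms.append(whole)


def inherited_parts(synset):
    """Collect parts declared on the synset itself or on any of its hypernyms."""
    parts = []
    seen = set()
    stack = [synset]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        for part in node.meronyms:
            name = part.words[0]
            if name not in parts:
                parts.append(name)
        stack.extend(node.hypernyms)
    return parts


# Illustrative data taken from the examples in the text; glosses are made up.
tree = Synset(["tree"], gloss="a tall woody plant")
oak = Synset(["oak"], gloss="a kind of tree")
leaf = Synset(["leaf"])
trunk = Synset(["trunk"])

add_hyponym(tree, oak)    # oak is a hyponym of tree; tree is a hypernym of oak
add_meronym(tree, leaf)   # leaf is a meronym of tree; tree is its holonym
add_meronym(tree, trunk)

oak_parts = inherited_parts(oak)
print(oak_parts)
```

Parts are stored once, on tree, and computed for oak on demand. This mirrors the economy argument in the text: a part does not need to be listed again for every kind of tree.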