Linking Lexicons and Ontologies: Mapping WordNet to the Suggested Upper Merged Ontology

Abstract: Ontologies are becoming extremely useful tools for sophisticated software engineering. Designing applications, databases, and knowledge bases with reference to a common ontology can mean shorter development cycles, easier and faster integration with other software and content, and a more scalable product. Although ontologies are a very promising solution to some of the most pressing problems that confront software engineering, they also raise some issues and difficulties of their own. Consider, for example, the questions below:

• How can a formal ontology be used effectively by those who lack extensive training in logic and mathematics?
• How can an ontology be used automatically by applications (e.g. Information Retrieval and Natural Language Processing applications) that process free text?
• How can we know when an ontology is complete?

In this paper we will begin by describing the upper-level ontology SUMO (Suggested Upper Merged Ontology), which has been proposed as the initial version of an eventual Standard Upper Ontology (SUO). We will then describe the popular, free, and structured WordNet lexical database. After this preliminary discussion, we will describe the methodology that we are using to align WordNet with the SUMO. We close this paper by discussing how this alignment of WordNet with SUMO will provide answers to the questions posed above.

keywords: natural language, ontology

1. SUMO

The SUMO (Suggested Upper Merged Ontology) is an ontology that was created at Teknowledge Corporation with extensive input from the SUO mailing list, and it has been proposed as a starter document for the IEEE-sanctioned SUO Working Group [1]. The SUMO was created by merging publicly available ontological content into a single, comprehensive, and cohesive structure [2,3]. As of February 2003, the ontology contains 1000 terms and 4000 assertions. The ontology can be browsed online (http://ontology.teknowledge.com), and source files for all of the versions of the ontology can be freely downloaded (http://ontology.teknowledge.com/cgibin/cvsweb.cgi/SUO/).