Techniques for automated taxonomy building: towards ontologies for knowledge management

Ontologies have become widely accepted as the main method for representing knowledge in Knowledge Management (KM) applications. Given the continuous, rapid change in knowledge in all fields, automated methods for constructing ontologies are of great importance. All ontologies or taxonomies currently in use have been hand built and require considerable manpower to keep up to date. Taxonomies are less logically rigorous than ontologies, and in this paper we consider the requirements for a system which automatically constructs taxonomies. There are a number of potentially useful methods for constructing hierarchically organised concepts from a collection of texts, and there are a number of automatic methods which permit one to associate one word with another. The important issue for the successful development of this research area is to identify techniques for labelling the relation between two candidate terms, if one exists. We consider a number of possible approaches and argue that the majority are unsuitable for our requirements.
