Document Ontology Discovery Tool

A concept taxonomy is an integral part of a classification system, but it is not sufficient for instantiating a knowledge base from which to train an automated classifier. A complete knowledge base requires an ontology, a set of instances, and functions that instantiate concepts and relations. Accordingly, a document classifier, a system for classifying the concepts (topics) contained in text documents, may be trained from a properly organized collection of topic descriptions. Given a repository of diverse documents, the problem is how to translate human objectives into an organized set of exemplars. In this paper, we describe a document ontology discovery tool for constructing knowledge bases. The user interacts with a repository indexed by latent semantic indexing and models the ontology, which is presented as a hierarchy of topics. The automatically generated concept taxonomies not only produce results comparable with those of other clustering techniques but also provide the meaningful topic descriptions necessary for human-computer interaction in ontology modeling.
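
For readers unfamiliar with latent semantic indexing, the following is a minimal sketch of how a repository might be indexed and queried in a latent space. It is not the tool's actual implementation: it assumes a plain term-document count matrix and a truncated SVD computed with NumPy, and the function names (`lsi_index`, `fold_in`, `cosine_rank`) are hypothetical, chosen only for illustration.

```python
import numpy as np


def lsi_index(term_doc, k):
    """Index a term-document matrix (terms x documents) with a rank-k SVD.

    Assumption: term_doc holds raw or weighted term counts.
    Returns the term basis U_k, the singular values, and one
    k-dimensional latent vector per document (rows of V_k * S_k).
    """
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    u_k, s_k = u[:, :k], s[:k]
    doc_vecs = vt[:k, :].T * s_k  # shape: (n_documents, k)
    return u_k, s_k, doc_vecs


def fold_in(query_counts, u_k):
    """Project a query term vector into the same latent space as doc_vecs."""
    return query_counts @ u_k


def cosine_rank(query_vec, doc_vecs):
    """Rank documents by cosine similarity to the query, best first."""
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = doc_vecs @ query_vec / np.where(norms == 0, 1.0, norms)
    return np.argsort(-sims), sims
```

In such a sketch, a query that shares no literal term with a document can still rank it highly when both fall along the same latent dimension, which is the property that makes a latent index a plausible substrate for grouping documents into topics.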