Automatically Enriching Domain Ontologies for Document Classification

Ontology-based document classification relies on the meaning of content in a given domain, as captured by that domain's ontologies. A domain ontology consists of a set of concepts and the relations that link them. However, such ontologies often do not cover concepts in depth, which limits their use in some subdomain applications. Techniques for enhancing ontologies, particularly ontology enrichment, have therefore emerged as an essential prerequisite for ontology-based applications. In this paper, we propose a new objective metric called SEMCON to enrich a domain ontology with new terms. To achieve this, SEMCON combines semantic and contextual information about terms within text documents. Experiments conducted on the funding domain demonstrate the applicability of the proposed model: document classification performed better with the enriched ontology than with the baseline ontology.