ONTOCUBE: efficient ontology extraction using OLAP cubes

Ontologies are knowledge conceptualizations of a particular domain and are commonly represented as hierarchies. While finished ontologies appear deceptively simple on paper, building them is a time-consuming task that is normally carried out with natural language processing techniques or schema matching. OLAP cubes, in contrast, are most commonly used in decision-making processes to analyze data summarizations. In this paper, we present a novel approach that uses OLAP cubes for ontology extraction. The resulting ontology is obtained by analyzing the summarized frequencies of keywords within a corpus. The solution was implemented within a relational database management system (DBMS). In our experiments, we show how each of the proposed discrimination measures (frequency, correlation, lift) affects the resulting classes. We also show a sample ontology and report the accuracy of finding true classes. Finally, we show the performance breakdown of our algorithm.
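
To make the three discrimination measures concrete, the following minimal sketch computes co-occurrence frequency, correlation, and lift for keyword pairs over a toy corpus. It is not the authors' implementation, which runs inside a DBMS. The function name `pair_measures`, the representation of each document as a set of keywords, and the choice of the standard association-rule lift and the phi (Pearson) coefficient on binary keyword occurrence are all assumptions made for illustration; the paper's exact definitions may differ.

```python
from itertools import combinations
from math import sqrt


def pair_measures(docs):
    """Return {(a, b): (frequency, correlation, lift)} for keyword pairs.

    docs is a list of sets of keywords (a hypothetical input format).
    Only pairs that co-occur in at least one document are reported.
    """
    n = len(docs)
    single = {}  # keyword -> number of documents containing it
    pair = {}    # (a, b) with a < b -> number of documents containing both
    for keywords in docs:
        for k in keywords:
            single[k] = single.get(k, 0) + 1
        for a, b in combinations(sorted(keywords), 2):
            pair[(a, b)] = pair.get((a, b), 0) + 1

    measures = {}
    for (a, b), n_ab in pair.items():
        p_a, p_b, p_ab = single[a] / n, single[b] / n, n_ab / n
        # Lift: observed co-occurrence relative to independence.
        lift = p_ab / (p_a * p_b)
        # Phi coefficient: Pearson correlation of two binary indicators.
        denom = sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
        corr = (p_ab - p_a * p_b) / denom if denom > 0 else 0.0
        measures[(a, b)] = (n_ab, corr, lift)
    return measures


toy_corpus = [
    {"olap", "cube"},
    {"olap", "cube"},
    {"olap", "ontology"},
    {"ontology"},
]
for keyword_pair, (freq, corr, lift) in sorted(pair_measures(toy_corpus).items()):
    print(keyword_pair, freq, round(corr, 3), round(lift, 3))
```

In this toy corpus the pair ("cube", "olap") co-occurs more often than independence would predict (lift above 1, positive correlation), while ("olap", "ontology") co-occurs less often (lift below 1, negative correlation). Thresholds on such measures are one plausible way to decide which keyword pairs are related strongly enough to be grouped.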
