SEMCON: A Semantic and Contextual Objective Metric for Enriching Domain Ontology Concepts

This paper presents a novel objective metric for concept enrichment that combines contextual and semantic information about terms extracted from domain documents. The proposed metric is called SEMCON, which stands for semantic and contextual objective metric. It employs a hybrid learning approach that draws on both statistical and linguistic ontology learning techniques. The metric also introduces, for the first time, two statistical features that have been shown to improve the overall score ranking of highly relevant terms for concept enrichment. Subjective and objective experiments are conducted in various domains. Experimental results (F1) from the computer domain show that SEMCON outperforms the tf*idf, χ², and LSA methods, with improvements of 12.2%, 21.8%, and 24.5% over them, respectively. Additionally, we investigate how much the contextual and semantic components each contribute to the overall task of concept enrichment; the results suggest that weighting the two components equally gives the best performance.
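
To make the weighting between the two components concrete, the following is a minimal sketch that assumes the overall score is a convex combination of a contextual score and a semantic score, with a weight w between 0 and 1. The function names, the default w = 0.5, and the example scores are illustrative assumptions, not values or code taken from the paper.

```python
from typing import Dict, List, Tuple


def combined_score(contextual: float, semantic: float, w: float = 0.5) -> float:
    """Hypothetical convex combination of two term-relevance scores.

    w = 0.5 corresponds to the balanced weighting mentioned in the abstract;
    the exact combination rule is an assumption made for illustration.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * contextual + (1.0 - w) * semantic


def rank_terms(scores: Dict[str, Tuple[float, float]], w: float = 0.5) -> List[str]:
    """Sort terms by combined score, highest first.

    scores maps each term to a (contextual, semantic) pair.
    """
    return sorted(scores, key=lambda t: combined_score(*scores[t], w), reverse=True)


# Made-up scores for three candidate terms of a "computer" concept.
example_scores = {
    "laptop": (0.9, 0.7),
    "keyboard": (0.6, 0.8),
    "banana": (0.1, 0.2),
}
ranking = rank_terms(example_scores)
```

Setting w to 1.0 or 0.0 in this sketch reduces the score to the contextual or the semantic component alone, which is one way to examine how much each component contributes.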
