Refining non-taxonomic relation labels with external structured data to support ontology learning

This paper presents a method for integrating external knowledge sources such as DBpedia and OpenCyc into an ontology learning system that automatically suggests labels for unknown relations in domain ontologies, based on large corpora of unstructured text. The method extracts and aggregates verb vectors from semantic relations identified in the corpus. It builds a knowledge base consisting of (i) verb centroids for known relations between domain concepts, (ii) mappings between concept pairs and the types of known relations, and (iii) ontological knowledge retrieved from external sources. Applying semantic inference and validation to this knowledge base improves the quality of the suggested relation labels. A formal evaluation compares the accuracy and average ranking precision of this hybrid method with those of methods that rely solely on corpus data and of methods based only on reasoning and external data sources.
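The centroid-based part of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the relation labels, verb counts, the use of cosine similarity, and the `allowed` filter (a stand-in for validation against external ontological knowledge) are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def centroid(vectors):
    """Average a list of sparse verb-count vectors into one centroid."""
    total = Counter()
    for vec in vectors:
        total.update(vec)  # adds counts verb by verb
    return {verb: count / len(vectors) for verb, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(verb, 0.0) for verb, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_labels(unknown_vector, centroids, allowed=None):
    """Rank known relation labels by similarity to an unlabeled relation.

    `allowed` is a hypothetical hook: a set of labels that external
    knowledge (e.g. domain/range constraints) permits for the concept pair.
    """
    candidates = {
        label: c for label, c in centroids.items()
        if allowed is None or label in allowed
    }
    scored = [(label, cosine(unknown_vector, c)) for label, c in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy verb vectors for relations whose labels are already known (invented data).
known = {
    "produces": [{"emit": 3, "release": 2}, {"emit": 1, "generate": 2}],
    "reduces": [{"cut": 2, "lower": 3}, {"cut": 1, "curb": 1}],
}
centroids = {label: centroid(vecs) for label, vecs in known.items()}

# Verb vector observed for a concept pair whose relation is unlabeled.
unknown = {"emit": 2, "generate": 1}
ranking = suggest_labels(unknown, centroids)
print(ranking[0][0])
```

In this toy setup the unlabeled relation shares verbs only with the "produces" centroid, so that label ranks first. Passing a restricted `allowed` set shows how knowledge from an external source could prune candidates before ranking.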
