Automating RDF Dataset Transformation and Enrichment

With the adoption of RDF across several domains come growing requirements on the completeness and quality of RDF datasets. Currently, these requirements are most commonly addressed by manually devising means of enriching an input dataset. The few tools that support this endeavour usually focus on the manual definition of enrichment pipelines. In this paper, we present a supervised learning approach based on a refinement operator for enriching RDF datasets. We show how exemplary descriptions of enriched resources can be used to generate accurate enrichment pipelines. We evaluate our approach against eight manually defined enrichment pipelines and show that it can learn accurate pipelines even when provided with only a small number of training examples.
