Assessing the Quality of Domain Concepts Descriptions in DBpedia

With the increasing volume of datasets on the Linked Open Data (LOD) cloud, it becomes necessary to assess Linked Data quality. This is especially important for DBpedia, which has become a prominent resource in the LOD cloud. In this paper, we evaluate the quality of the descriptions of domain concepts in DBpedia. Using a data-driven approach on a sample of domain concepts from Wikipedia, we show that a) the resources in our sample are described mainly by facts in DBpedia and seldom refer to the DBpedia ontology, b) DBpedia models these sample domain concepts very poorly at both the instance level and the schema level, c) very few predicates can be used for inference purposes, and d) very few domain predicates (object properties) are used in the description of domain concepts. These findings highlight the importance of restructuring the DBpedia knowledge base and including domain knowledge at the schema and instance levels.
