Linked Data Quality

The widespread adoption of semantic web technologies such as RDF, SPARQL and OWL enables individuals to build their databases on the web, write vocabularies, and define rules to arrange and explain the relationships between data according to the Linked Data principles. As a consequence, a large amount of structured and interlinked data is generated daily. Closely examining the quality of this data can be critical, especially when important research and professional decisions depend on it. Several linked data quality metrics have been proposed, covering numerous dimensions of linked data quality such as completeness, consistency, conciseness and interlinking. In this work, we focus on two of these dimensions: the completeness and conciseness of linked datasets. A set of experiments was conducted on a real-world dataset (DBpedia) to evaluate our proposed approaches.
