Quality Metrics for Linked Open Data

The vision of the Linked Open Data (LOD) initiative is to provide a model for publishing data and meaningfully interlinking dispersed but related datasets. Despite the importance of data quality for the successful growth of the LOD cloud, only limited attention has been paid to the quality of data prior to publication. This paper focuses on the systematic assessment of dataset quality before publication on the LOD cloud. To this end, we identify important quality deficiencies that need to be avoided or resolved before a dataset is published. We then propose a set of metrics to measure and identify these deficiencies in a dataset.
