A Metric Suite for Systematic Quality Assessment of Linked Open Data

The vision of the Linked Open Data (LOD) initiative is to provide a distributed model for publishing and meaningfully interlinking open data. Realizing this vision depends strongly on the quality of the data published as part of the LOD cloud. This paper focuses on the systematic quality assessment of datasets prior to their publication on the LOD cloud. To this end, we identify important quality deficiencies that need to be avoided or resolved before a dataset is published, and we propose a set of metrics to measure these deficiencies in a dataset. The metrics make it possible to assess a dataset and identify its undesirable quality characteristics. Publishers can then filter out low-quality data based on the assessment results, which in turn enables data consumers to make better-informed decisions when using open datasets.
