A framework for linked data fusion and quality assessment

The growth of semantic web technologies underpins the continuing development of linked data and their applications. In recent years, the number of linked data sources has grown from 12 to more than 2973 datasets. These datasets are managed as decentralized sources, and their quality is a serious concern. Assessing the quality of linked data is key to adopting them in different fields, because each dataset has been developed by a different group using various methods and tools. Moreover, crowdsourcing is one of the main strategies for data collection. Its contribution is evident in fields such as tourism and e-commerce and deserves attention. The qualitative and quantitative diversity of crowdsourced data is higher than that of data generated by official organizations and firms. In this paper, we first review and evaluate the dimensions and measures used for data quality assessment. We then present a novel framework for improving linked data quality evaluation and data fusion. Finally, we use several tools to assess the quality of data from some reputable data sources using the proposed framework.
