Crowd-assessing quality in uncertain data linking datasets

The quality of a dataset used for evaluating data linking methods, techniques, and tools depends on the availability of a set of mappings, called the reference alignment, that is known to be correct. In particular, it is crucial that each mapping represents a relation between a pair of entities that are similar because they denote the same object. Since the reliability of mappings is decisive for a fair evaluation of automatic linking methods and tools, we call this property mapping fairness. In this article, we propose a crowd-based approach, called Crowd Quality (CQ), for assessing the quality of data linking datasets by measuring the fairness of the mappings in the reference alignment. Moreover, we present a real experiment in which we evaluate two state-of-the-art data linking tools before and after refining the reference alignment with the CQ approach, in order to show the benefits of crowd assessment of mapping fairness.
