Enhancing Dataset Quality using Keys
The Linked Data principles provide a decentralized approach for publishing structured data in RDF on the Web. A consequence of this architectural choice is a high variance in the quality of the RDF datasets that constitute the Linked Data cloud. In this demo paper, we address one particular aspect of quality: the discriminability of resources. During our demo, we will present our simple three-step approach and interface, which allow data publishers to detect the resources in their dataset that are indistinguishable with respect to a given set of properties. Our approach is highly scalable, as it relies on ROCKER, a novel algorithm for key discovery. Our evaluation on DBpedia suggests that even very commonly used data sources still need significant improvement to satisfy the discriminability criterion.
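To make the notion of discriminability concrete, the following is a minimal sketch of a naive check for resources that are indistinguishable with respect to a given set of properties. It is not ROCKER and makes no claim about how the demo interface is implemented. The function name `indistinguishable_groups`, the `ex:` identifiers, and the sample triples are all hypothetical, and the sketch assumes that two resources are indistinguishable when they have exactly the same set of values for every property in the set.

```python
def indistinguishable_groups(triples, properties):
    """Return groups of subjects sharing identical value sets for all `properties`.

    Assumption (not taken from the paper): two resources are indistinguishable
    when, for each property, their sets of values are exactly equal.
    """
    ordered = sorted(properties)

    # Collect the set of values of each relevant property, per subject.
    values = {}
    for subject, predicate, obj in triples:
        per_subject = values.setdefault(subject, {})
        if predicate in properties:
            per_subject.setdefault(predicate, set()).add(obj)

    # Subjects with the same signature cannot be told apart.
    groups = {}
    for subject, per_subject in values.items():
        signature = tuple(frozenset(per_subject.get(p, ())) for p in ordered)
        groups.setdefault(signature, []).append(subject)

    # Only groups with more than one member violate discriminability.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical sample data, for illustration only.
triples = [
    ("ex:a", "ex:name", "Paris"), ("ex:a", "ex:country", "France"),
    ("ex:b", "ex:name", "Paris"), ("ex:b", "ex:country", "France"),
    ("ex:c", "ex:name", "Paris"), ("ex:c", "ex:country", "USA"),
]

by_name_and_country = indistinguishable_groups(triples, {"ex:name", "ex:country"})
by_name_only = indistinguishable_groups(triples, {"ex:name"})
```

In this sample, the property set {`ex:name`, `ex:country`} leaves `ex:a` and `ex:b` indistinguishable, while `ex:name` alone fails to separate any of the three resources. A property set for which the function returns an empty list would behave as a key on this data.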
[1] Jens Lehmann, et al. Test-driven evaluation of linked data quality, 2014, WWW.
[2] Axel-Cyrille Ngonga Ngomo, et al. ROCKER: A Refinement Operator for Key Discovery, 2015, WWW.
[3] Jérôme David, et al. Keys and Pseudo-Keys Detection for Web Datasets Cleansing and Interlinking, 2012, EKAW.
[4] Jens Lehmann, et al. Quality assessment for Linked Data: A Survey, 2015, Semantic Web.
[5] Nathalie Pernelle, et al. SAKey: Scalable Almost Key Discovery in RDF Data, 2014, SEMWEB.