Identifying duplicate instances in the Data Web is most commonly performed (semi-)automatically using instance matching frameworks. However, current instance matching benchmarks give end users and developers little insight into how these frameworks behave when dealing with real data. In this demo paper, we present Lance, a domain-independent instance matching benchmark generator for Linked Data. Lance is the first benchmark generator for Linked Data to support semantics-aware test cases that take into account complex OWL constructs, in addition to the standard test cases based on structure and value transformations. Lance supports the definition of matching tasks with varying degrees of difficulty and produces a weighted gold standard, which allows a more fine-grained analysis of the performance of instance matching tools. It accepts any linked dataset and its accompanying schema as input and produces a target dataset that implements test cases of varying levels of difficulty. In this demo, we will present the benchmark generation process underlying Lance as well as the user interface designed to support Lance users.
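To illustrate why a weighted gold standard permits a finer-grained analysis than an unweighted one, the following is a minimal sketch and not Lance's actual scoring procedure. It assumes that each gold-standard pair carries a numeric weight reflecting how hard the match is to find; the function name `weighted_scores`, the identifiers, and the weight values are all hypothetical.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (source instance, target instance)


def weighted_scores(gold: Dict[Pair, float], found: Set[Pair]) -> Tuple[float, float]:
    """Return (plain recall, weighted recall) for a set of reported matches.

    Assumption: a larger weight marks a pair that is harder to match.
    """
    total_weight = sum(gold.values())
    if not gold or total_weight <= 0:
        raise ValueError("gold standard must be non-empty with positive total weight")
    hits = [pair for pair in gold if pair in found]
    plain = len(hits) / len(gold)
    weighted = sum(gold[pair] for pair in hits) / total_weight
    return plain, weighted


# Hypothetical gold standard: two easy pairs and one hard pair.
gold = {
    ("src:a1", "tgt:b1"): 1.0,
    ("src:a2", "tgt:b2"): 1.0,
    ("src:a3", "tgt:b3"): 2.0,
}
# A hypothetical tool that finds only the two easy pairs.
found = {("src:a1", "tgt:b1"), ("src:a2", "tgt:b2")}

plain, weighted = weighted_scores(gold, found)
```

Under these assumed weights, the tool recovers two of three pairs but only half of the total weight, so the weighted score exposes that it missed the difficult case.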