LANCE: A Generic Benchmark Generator for Linked Data

Duplicate instances in the Data Web are most commonly identified (semi-)automatically using instance matching frameworks. However, current instance matching benchmarks do not give end users and developers sufficient insight into how these frameworks behave on real data. In this demo paper, we present Lance, a domain-independent instance matching benchmark generator for Linked Data. Lance is the first benchmark generator for Linked Data to support semantics-aware test cases that take into account complex OWL constructs, in addition to the standard test cases based on structure and value transformations. Lance supports the definition of matching tasks with varying degrees of difficulty and produces a weighted gold standard, which allows a more fine-grained analysis of the performance of instance matching tools. It accepts any linked dataset and its accompanying schema as input and produces a target dataset that implements test cases of varying levels of difficulty. In this demo, we present the benchmark generation process underlying Lance as well as the user interface designed to support Lance users.
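
To illustrate why a weighted gold standard can support a finer-grained analysis than an unweighted one, the following minimal sketch compares plain recall with a weight-aware recall. It is not Lance's actual gold standard format or scoring method: the weight semantics (a value in [0, 1], where a higher value marks a harder match), the instance identifiers, and the function names are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# A matched pair: (source instance URI, target instance URI).
Pair = Tuple[str, str]


def plain_recall(gold: Dict[Pair, float], predicted: Set[Pair]) -> float:
    """Fraction of gold pairs that the tool found, ignoring weights."""
    if not gold:
        return 0.0
    found = sum(1 for pair in gold if pair in predicted)
    return found / len(gold)


def weighted_recall(gold: Dict[Pair, float], predicted: Set[Pair]) -> float:
    """Share of the total gold weight covered by the tool's correct matches.

    Assumption: each gold pair carries a weight in [0, 1], with higher
    values denoting harder matches, so finding a hard match counts more.
    """
    total = sum(gold.values())
    if total == 0:
        return 0.0
    found = sum(weight for pair, weight in gold.items() if pair in predicted)
    return found / total


# Hypothetical gold standard: one easy match and one hard match.
gold = {
    ("src:a", "tgt:a1"): 0.2,  # easy, e.g. a small value change
    ("src:b", "tgt:b1"): 0.8,  # hard, e.g. several combined transformations
}

# Hypothetical tool output: finds the easy match, misses the hard one,
# and reports one pair that is not in the gold standard.
predicted = {("src:a", "tgt:a1"), ("src:c", "tgt:c1")}

print(plain_recall(gold, predicted))
print(weighted_recall(gold, predicted))
```

Under these assumptions, the tool's plain recall is one half, while its weighted recall is lower, because the only match it found was the easy one. Two tools with identical plain recall can therefore be distinguished by which matches they actually recover.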