QALM: a Benchmark for Question Answering over Linked Merchant Websites Data

This paper presents a benchmark for training and evaluating question answering systems that mediate between a user, who expresses information needs in natural language, and semantic data in the commercial domain of the mobile phone industry. We first describe the RDF dataset that we extracted through the APIs of merchant websites, and the schemas on which it relies. We then present the methodology we applied to create a set of natural language questions expressing possible user needs in this domain. This question set has been further annotated with both the corresponding SPARQL queries and the correct answers retrieved from the dataset.