RDF-Hunter: Automatically Crowdsourcing the Execution of Queries Against RDF Data Sets

In recent years, a large number of RDF data sets have become available on the Web. However, due to the semi-structured nature of RDF data, missing values affect the answer completeness of queries posed against these data sets. To overcome this limitation, we propose RDF-Hunter, a novel hybrid query processing approach that combines machine and human computation to execute queries against RDF data. We develop a novel quality model and query engine that enable RDF-Hunter to decide on the fly which parts of a query should be executed through conventional technology and which through crowd computing. To evaluate RDF-Hunter, we created a collection of 50 SPARQL queries against the DBpedia data set, executed them using our hybrid query engine, and analyzed the accuracy of the outcomes obtained from the crowd. The experiments show that the overall approach is feasible and produces query results that reliably and significantly enhance the completeness of automatic query processing responses.