Ranking RDF with Provenance via Preference Aggregation

Information retrieval on RDF data benefits greatly from provenance information attached to individual RDF statements. Provenance information such as the origin of the data, its certainty, and its temporal validity can be used to rank search results along any one of those dimensions. In this paper, we consider the problem of aggregating provenance information from different dimensions in order to obtain a joint ranking over all dimensions. We relate this to the problem of preference aggregation in social choice theory and translate different solutions for preference aggregation to the problem of aggregating provenance rankings. By exploiting the rank orderings on the provenance dimensions, we characterize three approaches to aggregating preferences, namely the lexicographical rule, the Borda rule, and the plurality rule, in our framework of provenance aggregation.
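
As a rough illustration of the three rules named above, the following Python sketch applies their textbook social-choice definitions to a toy example. It is not the formalization used in the paper. The statement names (`a`, `b`, `c`), the rank levels, the function names, and the tie handling (ties broken by statement name) are all assumptions made for the example. Each provenance dimension is modelled as a mapping from a statement to a rank level, where a lower level is better and ties within a dimension are allowed.

```python
# Each dict is one provenance dimension (hypothetical data):
# statement -> rank level, lower is better, ties allowed.
dims = [
    {"a": 0, "b": 0, "c": 1},  # e.g. source trust (highest priority)
    {"a": 2, "b": 1, "c": 0},  # e.g. certainty
    {"a": 2, "b": 1, "c": 0},  # e.g. recency
]


def lexicographic(dims):
    """Compare on the first dimension; later dimensions only break ties."""
    items = sorted(dims[0])  # name order makes remaining ties deterministic
    return sorted(items, key=lambda x: tuple(d[x] for d in dims))


def borda(dims):
    """Per dimension, a statement earns one point for each statement
    ranked strictly below it; points are summed over all dimensions."""
    items = sorted(dims[0])
    score = {
        x: sum(sum(1 for y in items if d[y] > d[x]) for d in dims)
        for x in items
    }
    return sorted(items, key=lambda x: -score[x])


def plurality(dims):
    """A statement earns one vote for each dimension in which it is
    among the top-ranked statements."""
    items = sorted(dims[0])
    votes = {
        x: sum(1 for d in dims if d[x] == min(d.values()))
        for x in items
    }
    return sorted(items, key=lambda x: -votes[x])


print(lexicographic(dims))  # ['b', 'a', 'c']
print(borda(dims))          # ['c', 'b', 'a']
print(plurality(dims))      # ['c', 'a', 'b']
```

On this toy input the three rules produce three different joint rankings, which shows why the choice of aggregation rule matters: the lexicographical rule lets the first dimension dominate, whereas the Borda and plurality rules weigh all dimensions equally.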
