The Shapley Value of Tuples in Query Answering

We investigate the application of the Shapley value to quantifying the contribution of a tuple to a query answer. The Shapley value is a widely known numerical measure in cooperative game theory, and in many applications of game theory, for assessing the contribution of a player to a coalition game. It was established in the 1950s and is theoretically justified by being the unique wealth-distribution measure that satisfies some natural axioms. While this value has been investigated in several areas, it has received little attention in data management. We study this measure in the context of conjunctive and aggregate queries by defining corresponding coalition games. We provide algorithmic and complexity-theoretic results on the computation of Shapley-based contributions to query answers, and for the hard cases we present approximation algorithms.
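To make the idea of a tuple's Shapley-based contribution concrete, the following is a minimal brute-force sketch, not the algorithms of this work. It assumes that every fact of a small, hypothetical database is a player and that the value of a coalition is 1 if a Boolean conjunctive query holds on it and 0 otherwise. The names `shapley_values` and `query_holds`, the toy facts, and the query are all illustrative assumptions. The function averages each player's marginal contribution over all orderings, which is the standard permutation form of the Shapley value and takes exponential time.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by averaging marginal contributions
    over all orderings of the players (exponential time)."""
    players = list(players)
    totals = {p: Fraction(0) for p in players}
    count = 0
    for order in permutations(players):
        coalition = set()
        before = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            after = value(frozenset(coalition))
            totals[p] += after - before  # marginal contribution of p
            before = after
        count += 1
    return {p: total / count for p, total in totals.items()}


# Hypothetical toy database: R(a), S(a, b), S(a, c).
facts = [("R", "a"), ("S", "a", "b"), ("S", "a", "c")]


def query_holds(coalition):
    """Assumed Boolean query q() :- R(x), S(x, y); returns 1 or 0."""
    r_values = {f[1] for f in coalition if f[0] == "R"}
    return int(any(f[0] == "S" and f[1] in r_values for f in coalition))


values = shapley_values(facts, query_holds)
for fact, v in values.items():
    print(fact, v)
```

In this toy instance the fact R(a) is needed by every derivation of the answer, while the two S facts can substitute for each other, so R(a) receives the larger share. The shares sum to the value of the full database, which is 1 here.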
