Decisions, Counterfactual Explanations and Strategic Behavior

As data-driven predictive models are increasingly used to inform decisions, it has been argued that decision makers should provide explanations that help individuals understand what would have to change for these decisions to be beneficial ones. However, there has been little discussion of the possibility that individuals may use these counterfactual explanations to invest effort strategically and maximize their chances of receiving a beneficial decision. In this paper, our goal is to find policies and counterfactual explanations that are optimal in terms of utility in such a strategic setting. We first show that, given a pre-defined policy, the problem of finding the optimal set of counterfactual explanations is NP-hard. We then show that the corresponding objective is nondecreasing and submodular, which allows a standard greedy algorithm to enjoy approximation guarantees. We further show that the problem of jointly finding the optimal policy and the optimal set of counterfactual explanations reduces to maximizing a non-monotone submodular function. As a result, we can solve it with a recent randomized algorithm that also offers approximation guarantees. Finally, we demonstrate that, by incorporating a matroid constraint into the problem formulation, we can increase the diversity of the optimal set of counterfactual explanations and incentivize individuals across the whole spectrum of the population to self-improve. Experiments on synthetic and real lending and credit card data illustrate our theoretical findings and show that the counterfactual explanations and decision policies found by our algorithms achieve higher utility than several competitive baselines.
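To make the greedy step concrete, the following is a minimal sketch of the standard greedy algorithm for maximizing a nondecreasing submodular set function, assuming a cardinality budget k on the number of explanations. It is not the paper's implementation: the names `greedy_max`, `covers`, and `coverage` are hypothetical, and the toy coverage objective (how many individuals are reached by at least one chosen explanation) only stands in for the utility, which the abstract does not specify. For objectives of this kind under a cardinality constraint, the classical guarantee for greedy selection is a factor of 1 - 1/e of the optimum.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Set


def greedy_max(
    ground_set: Iterable[Hashable],
    objective: Callable[[FrozenSet[Hashable]], float],
    k: int,
) -> Set[Hashable]:
    """Pick up to k items, each time adding the one with the largest marginal gain."""
    items = list(ground_set)
    selected: Set[Hashable] = set()
    for _ in range(k):
        base = objective(frozenset(selected))
        best_item, best_gain = None, 0.0
        for item in items:
            if item in selected:
                continue
            gain = objective(frozenset(selected | {item})) - base
            # Strict comparison: ties go to the earliest item in the list.
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item improves the objective
            break
        selected.add(best_item)
    return selected


# Hypothetical toy data: each candidate explanation "covers" some individuals.
covers = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}


def coverage(chosen: FrozenSet[Hashable]) -> float:
    """Toy stand-in for utility: number of individuals covered by the chosen set."""
    covered = set()
    for name in chosen:
        covered |= covers[name]
    return float(len(covered))


chosen = greedy_max(covers, coverage, k=2)
print(sorted(chosen), coverage(frozenset(chosen)))
```

In this toy instance the first pick is the explanation covering the most individuals, and the second is the one that adds the most individuals not yet covered. The non-monotone joint problem and the matroid-constrained variant mentioned above need different algorithms and are not covered by this sketch.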
