Towards Robust and Reliable Algorithmic Recourse

As predictive models are increasingly deployed in high-stakes decision making (e.g., loan approvals), there has been growing interest in post hoc techniques that provide recourse to affected individuals. These techniques generate recourses under the assumption that the underlying predictive model does not change. In practice, however, models are regularly updated for a variety of reasons (e.g., dataset shifts), rendering previously prescribed recourses ineffective. To address this problem, we propose a novel framework, RObust Algorithmic Recourse (ROAR), which leverages adversarial training to find recourses that are robust to model shifts. To the best of our knowledge, this work proposes the first solution to this critical problem. We also carry out a detailed theoretical analysis that underscores the importance of constructing recourses robust to model shifts: (1) we derive a lower bound on the probability that recourses generated by existing approaches, which are not robust to model shifts, are invalidated; and (2) we prove that the additional cost incurred by the robust recourses output by our framework is bounded. Experimental evaluation on multiple synthetic and real-world datasets demonstrates the efficacy of the proposed framework and supports our theoretical findings.
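At a high level, ROAR casts robust recourse generation as a min-max problem: the recourse is optimized against the worst-case model within a bounded set of plausible shifts of the current model. The sketch below illustrates this idea for the special case of a linear (logistic) model, an l-infinity ball of parameter perturbations, and an l1 cost of change; the function name, hyperparameters, and closed-form inner maximization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def roar_recourse(x, w, b, lam=0.1, delta_max=0.1, steps=200, lr=0.05):
    """Gradient-based search for a recourse x_cf for a linear model
    f(x) = sigmoid(w @ x + b), optimized against the worst-case model
    within an l-infinity ball of radius delta_max around (w, b)."""
    x_cf = x.astype(float).copy()
    for _ in range(steps):
        # Inner maximization: for a linear model, the worst-case shift of
        # (w, b) within the l-infinity ball simply pushes each parameter
        # against the recourse, minimizing the favorable score w @ x_cf + b.
        w_adv = w - delta_max * np.sign(x_cf)
        b_adv = b - delta_max
        # Outer minimization: one gradient step on the loss under the
        # worst-case model plus an l1 cost on the change from x.
        # (The dependence of w_adv on x_cf through the sign is treated as
        # constant here, a common simplification.)
        score = w_adv @ x_cf + b_adv
        p = sigmoid(score)
        grad_validity = -(1.0 - p) * w_adv    # d/dx_cf of -log sigmoid(score)
        grad_cost = lam * np.sign(x_cf - x)   # subgradient of lam * ||x_cf - x||_1
        x_cf = x_cf - lr * (grad_validity + grad_cost)
    return x_cf

# Illustrative usage with a toy linear model (not from the paper's experiments).
x = np.array([0.2, -1.0, 0.5])
w = np.array([1.5, -0.5, 2.0])
b = -1.0
x_cf = roar_recourse(x, w, b)
```

For non-linear models, the inner maximization would itself require gradient ascent over the parameter perturbation rather than the closed-form step used in this linear sketch.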
