CounterNet: End-to-End Training of Counterfactual-Aware Predictions

This work presents CounterNet, a novel end-to-end learning framework that integrates predictive model training and counterfactual (CF) explanation generation into a single pipeline. A counterfactual explanation seeks the smallest modification to an instance's feature values that changes the ML model's prediction to a predefined output. Prior CF explanation techniques solve a separate, time-intensive optimization problem for every input instance, and they suffer from a misalignment of objectives between model predictions and explanations, which significantly degrades the quality of the resulting CF explanations. CounterNet, in contrast, integrates prediction and explanation in the same framework, so that CF example generation is optimized only once, jointly with the predictive model. We propose a novel variant of back-propagation that enables effective training of CounterNet. Finally, we conduct extensive experiments on multiple real-world datasets. Our results show that CounterNet generates high-quality predictions and corresponding CF examples (with high validity) for any new input instance significantly faster than existing state-of-the-art baselines.
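To make the architectural idea concrete, the following is a minimal sketch, in plain NumPy, of what a single joint pipeline could look like: a shared encoder feeds two heads, one emitting the prediction and one emitting a perturbation that yields the CF example. All names, dimensions, and weight shapes here are illustrative assumptions, not the paper's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical dimensions and randomly initialized weights (for
# illustration only; a real model would learn these end-to-end).
d_in, d_hidden = 5, 8
W_enc = 0.1 * rng.normal(size=(d_in, d_hidden))     # shared encoder
W_pred = 0.1 * rng.normal(size=(d_hidden, 1))       # prediction head
W_cf = 0.1 * rng.normal(size=(d_hidden, d_in))      # CF generator head

def counternet_forward(x):
    """One forward pass returning (prediction, CF example).

    Both outputs come from the same shared representation, which is
    what lets a single training loop optimize them jointly.
    """
    h = relu(x @ W_enc)          # shared representation
    y_hat = sigmoid(h @ W_pred)  # predicted class probability
    delta = h @ W_cf             # perturbation proposed by the CF head
    x_cf = x + delta             # CF example in input space
    return y_hat, x_cf

x = rng.normal(size=(1, d_in))
y_hat, x_cf = counternet_forward(x)
print(y_hat.shape, x_cf.shape)
```

At inference time, producing a CF example is then a single forward pass rather than a per-instance optimization, which is the source of the speedup the abstract describes.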
