Actionable Feature Discovery in Counterfactuals using Feature Relevance Explainers

Counterfactual explanations focus on “actionable knowledge” to help end-users understand how a Machine Learning model outcome could be changed to a more desirable one. For this purpose, a counterfactual explainer needs to reason with similarity knowledge in order to discover input dependencies that relate to outcome changes. Identifying the minimum subset of feature changes needed to action a change in the decision is an interesting challenge for counterfactual explainers. In this paper we show how feature-relevance-based explainers (e.g. LIME, SHAP) can inform a counterfactual explainer to identify the minimum subset of “actionable features”. We demonstrate our DisCERN (Discovering Counterfactual Explanations using Relevance Features from Neighbourhoods) algorithm on three datasets and compare it against the widely used counterfactual approach DiCE. Our preliminary results show DisCERN to be a viable strategy for minimising the number of actionable feature changes.
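
To make the strategy described above concrete, the sketch below is a minimal, illustrative reading of the abstract: retrieve the nearest unlike neighbour (an instance with the desired outcome), then copy its feature values into the query in decreasing order of feature relevance (e.g. absolute SHAP values) until the model's prediction flips. The function name, its signature, and the substitution loop are our own assumptions for illustration, not the published DisCERN algorithm.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def discern_counterfactual(model, x, X_unlike, relevance):
    """Illustrative sketch (not the published algorithm).

    model     -- fitted classifier exposing .predict()
    x         -- 1-D query instance, shape (n_features,)
    X_unlike  -- array of instances with the desired (different) outcome
    relevance -- per-feature relevance scores for x, e.g. |SHAP values|
    """
    original = model.predict(x.reshape(1, -1))[0]

    # 1. Retrieve the nearest unlike neighbour (NUN) as the counterfactual anchor.
    nn = NearestNeighbors(n_neighbors=1).fit(X_unlike)
    nun = X_unlike[nn.kneighbors(x.reshape(1, -1), return_distance=False)[0, 0]]

    # 2. Substitute NUN feature values, most relevant feature first,
    #    until the model's prediction changes.
    candidate = x.copy()
    changed = []
    for f in np.argsort(relevance)[::-1]:
        candidate[f] = nun[f]
        changed.append(f)
        if model.predict(candidate.reshape(1, -1))[0] != original:
            return candidate, changed  # minimal subset of actionable changes found
    return nun, changed  # fallback: the full NUN is itself a counterfactual
```

In this reading, the relevance scores could come from any feature relevance explainer; for instance, the per-instance attributions returned by SHAP or LIME, taken in absolute value, would serve as the ordering signal.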

[1] Scott Lundberg et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.

[2] Barry Smyth et al. Good Counterfactuals and Where to Find Them: A Case-Based Technique for Generating Counterfactuals for Explainable AI (XAI), 2020, ICCBR.

[3] Carlos Guestrin et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016, arXiv.

[4] Ivan Koychev et al. Feature Selection and Generalisation for Retrieval of Textual Cases, 2004, ECCBR.

[5] Chris Russell et al. Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR, 2017, arXiv.

[6] P. Harris et al. Children's use of counterfactual thinking in causal reasoning, 1996, Cognition.

[7] David W. Aha et al. A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms, 1997, Artificial Intelligence Review.

[8] David Martens et al. NICE: An Algorithm for Nearest Instance Counterfactual Explanations, 2021, arXiv.

[9] Mark T. Keane et al. Twin-Systems to Explain Artificial Neural Networks using Case-Based Reasoning: Comparative Tests of Feature-Weighting Methods in ANN-CBR Twins for XAI, 2019, IJCAI.

[10] Jonathan Timmis et al. A Comment on Opt-AiNET: An Immune Network Algorithm for Optimisation, 2004, GECCO.

[11] Chenhao Tan et al. Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End, 2021, AIES.

[12] Amit Sharma et al. Explaining machine learning classifiers through diverse counterfactual explanations, 2020, FAT*.

[13] Eric D. Ragan et al. A Survey of Evaluation Methods and Measures for Interpretable Machine Learning, 2018, arXiv.

[14] N. Roese. Counterfactual thinking, 1997, Psychological Bulletin.

[15] L. Shapley. A Value for n-person Games, 1988.