Reliable Counterfactual Explanations for Autoencoder-based Anomalies

Autoencoders have been used successfully for anomaly detection in an unsupervised setting, and often outperform traditional approaches such as clustering and subspace-based (linear) methods. A data point is flagged as anomalous by an autoencoder if its reconstruction loss exceeds an appropriate threshold. However, as with other deep learning models, the increased accuracy offered by autoencoders comes at the cost of interpretability. Explaining an autoencoder’s decision to flag a particular data point as an anomaly is of great importance, since a human-friendly explanation is necessary for a domain expert tasked with evaluating the model’s decisions. We consider the problem of finding counterfactual explanations for autoencoder anomalies, which address the question of what needs to be minimally changed in a given anomalous data point to make it non-anomalous. We present an algorithm that generates a diverse set of proximate counterfactual explanations for a given autoencoder anomaly. We also introduce the notion of reliability of a counterfactual, and present techniques to find reliable counterfactual explanations.
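To make the two ideas in the abstract concrete, the sketch below shows (i) reconstruction-loss-based anomaly flagging and (ii) a minimal gradient-based search for a proximate counterfactual. It is an illustration under assumptions, not the paper's algorithm: it presumes a trained PyTorch autoencoder `model`, a threshold `tau`, and an anomalous input `x_anom`, and the hyperparameters (`lambda_prox`, `steps`, `lr`) and the L1 proximity penalty are hypothetical choices.

```python
import torch

def reconstruction_loss(model, x):
    # Squared-error reconstruction loss of the autoencoder for a single input.
    return ((model(x) - x) ** 2).sum()

def is_anomaly(model, x, tau):
    # A point is flagged as anomalous if its reconstruction loss exceeds the threshold.
    return reconstruction_loss(model, x).item() > tau

def counterfactual(model, x_anom, tau, lambda_prox=0.1, steps=500, lr=0.01):
    # Hypothetical sketch of a counterfactual search: minimise the reconstruction
    # loss (to push the point below the anomaly threshold) plus an L1 proximity
    # penalty (to change x_anom as little as possible).
    x_cf = x_anom.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = reconstruction_loss(model, x_cf) + lambda_prox * (x_cf - x_anom).abs().sum()
        loss.backward()
        opt.step()
        if reconstruction_loss(model, x_cf).item() <= tau:
            break  # candidate is no longer flagged as anomalous
    return x_cf.detach()
```

In this simplified picture, a diverse set of counterfactuals could be obtained by re-running the search from different initialisations or with different feature-wise penalties; the reliability notion introduced in the paper is not captured by this sketch.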
