Generative Local Interpretable Model-Agnostic Explanations

The use of AI and machine learning models in industry is growing rapidly. Driven by this growth and the strong performance of these models, intelligent systems are increasingly entrusted with mission-critical decisions. Despite their success, AI solutions used for decision-making have a significant drawback: a lack of transparency. Particularly with complex, state-of-the-art machine learning algorithms, this opacity leaves users with little understanding of how specific decisions are made. To address this issue, algorithms such as LIME and SHAP (Kernel SHAP) have been introduced. These algorithms explain AI models by generating data samples around an intended test instance through perturbation of its features, a process that can produce invalid data points outside the data domain. In this paper, we aim to improve LIME and SHAP by using a Variational AutoEncoder (VAE), pre-trained on the training dataset, to generate realistic data around the test instance. We also employ a sensitivity-based feature importance weighted by a Boltzmann distribution to help explain the behavior of the black-box model around the intended test instance.
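The sketch below is a minimal illustration of the idea described above, not the authors' implementation: neighbors of a test instance are drawn by perturbing its latent representation and decoding back to feature space, a LIME-style weighted linear surrogate is fit to the black-box predictions on those neighbors, and the resulting sensitivities are normalized with a Boltzmann distribution. The PCA encoder/decoder is only a runnable stand-in for the pre-trained VAE, and the names `black_box`, `sample_neighbors`, `sigma`, and `temperature` are illustrative assumptions, not from the paper.

```python
# Sketch: generative neighborhood sampling for a local surrogate explanation.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Toy training data and a black-box model to explain.
X_train = rng.normal(size=(500, 6))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)
black_box = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Stand-in for the pre-trained VAE: encode to a latent space, decode back.
latent = PCA(n_components=3).fit(X_train)

def sample_neighbors(x, n_samples=200, sigma=0.3):
    """Generate realistic neighbors by perturbing x in latent space and decoding."""
    z = latent.transform(x.reshape(1, -1))
    z_noisy = z + sigma * rng.normal(size=(n_samples, z.shape[1]))
    return latent.inverse_transform(z_noisy)

x_test = X_train[0]
neighbors = sample_neighbors(x_test)
preds = black_box.predict_proba(neighbors)[:, 1]

# LIME-style surrogate: weight neighbors by proximity to x_test, fit a linear model.
dist = np.linalg.norm(neighbors - x_test, axis=1)
weights = np.exp(-(dist ** 2) / (2 * dist.std() ** 2))
surrogate = Ridge(alpha=1.0).fit(neighbors, preds, sample_weight=weights)

# Sensitivity-style importances turned into a Boltzmann distribution over features:
# larger |coefficient| receives a larger share of the explanation mass.
temperature = 0.5
scores = np.abs(surrogate.coef_)
boltzmann = np.exp(scores / temperature) / np.exp(scores / temperature).sum()
for i, p in enumerate(boltzmann):
    print(f"feature {i}: importance {p:.3f}")
```

Decoding latent perturbations, rather than perturbing raw features as LIME and Kernel SHAP do, is what keeps the sampled neighborhood close to the data manifold; in the paper this role is played by the VAE's encoder and decoder rather than the placeholder used here.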
