AI Explainability 360: Impact and Design

As artificial intelligence and machine learning algorithms become increasingly prevalent in society, multiple stakeholders are calling for these algorithms to provide explanations. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, have different explanation needs. To address these needs, in 2019 we created AI Explainability 360 (Arya et al. 2020), an open-source software toolkit featuring ten diverse, state-of-the-art explainability methods and two evaluation metrics. This paper examines the impact of the toolkit through several case studies, statistics, and community feedback. Users have engaged with AI Explainability 360 in different ways, leading to multiple forms of impact and improvement across several metrics, highlighted by the toolkit's adoption by the independent LF AI & Data Foundation. The paper also describes the flexible design of the toolkit, examples of its use, and the extensive educational material and documentation available to its users.
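To make the abstract's reference to the toolkit's flexible design more concrete, the sketch below illustrates the general pattern described in the toolkit and taxonomy paper [16]: explainability methods are organized under common base classes so that different algorithms can be swapped behind a uniform set-params/explain style interface. This is a minimal, hypothetical Python sketch, not the actual AIX360 API; the names LocalBBExplainer, set_params, and explain_instance, and the toy PerturbationExplainer, are illustrative assumptions.

from abc import ABC, abstractmethod
import numpy as np


class LocalBBExplainer(ABC):
    """Hypothetical base class for local black-box explainers (illustration only)."""

    def set_params(self, **params):
        # Store algorithm-specific hyperparameters on the explainer.
        self.params = params
        return self

    @abstractmethod
    def explain_instance(self, x, predict_fn):
        """Return an explanation for a single input x of the model predict_fn."""


class PerturbationExplainer(LocalBBExplainer):
    # Toy method: score each feature by how much the prediction drops
    # when that feature is zeroed out. Stands in for any local explainer.
    def explain_instance(self, x, predict_fn):
        base = predict_fn(x.reshape(1, -1))[0]
        scores = np.empty_like(x)
        for i in range(x.shape[0]):
            x_pert = x.copy()
            x_pert[i] = 0.0
            scores[i] = base - predict_fn(x_pert.reshape(1, -1))[0]
        return scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=5)
    predict = lambda X: X @ w          # stand-in for a trained black-box model
    x = rng.normal(size=5)
    explainer = PerturbationExplainer().set_params()
    print(explainer.explain_instance(x, predict))   # per-feature attribution scores

The point of the sketch is the design property, not the toy scoring rule: a caller can switch between explainers without changing the surrounding code, which is the kind of interchangeability the toolkit's taxonomy and common interfaces aim to provide.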

[1] Alun D. Preece et al., 2018. Interpretable to Whom? A Role-based Model for Analyzing Interpretable Machine Learning Systems. arXiv.

[2] Charu C. Aggarwal et al., 2019. Efficient Data Representation by Selecting Prototypes with Importance Weights. IEEE International Conference on Data Mining (ICDM).

[3] Luciano Floridi et al., 2017. Why a Right to Explanation of Automated Decision-Making Does Not Exist in the General Data Protection Regulation.

[4] Tommi S. Jaakkola et al., 2018. Towards Robust Interpretability with Self-Explaining Neural Networks. NeurIPS.

[5] Cynthia Rudin et al., 2018. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nature Machine Intelligence.

[6] Amit Dhurandhar et al., 2019. Leveraging Latent Features for Local Explanations. KDD.

[7] Sanjeeb Dash et al., 2018. Boolean Decision Rules via Column Generation. NeurIPS.

[8] Scott Lundberg et al., 2017. A Unified Approach to Interpreting Model Predictions. NIPS.

[9] Carlos Guestrin et al., 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. arXiv.

[10] Abhishek Kumar et al., 2017. Variational Inference of Disentangled Latent Concepts from Unlabeled Observations. ICLR.

[11] Ziming Huang et al., 2021. On Sample Based Explanation Methods for NLP: Faithfulness, Efficiency and Semantic Evaluation. Annual Meeting of the Association for Computational Linguistics (ACL).

[12] Amit Dhurandhar et al., 2018. Improving Simple Models with Confidence Profiles. NeurIPS.

[13] Michael Hind et al., 2019. Explaining Explainable AI. XRDS.

[14] Julia Powles et al., 2017. "Meaningful Information" and the Right to Explanation. FAT.

[15] Babak Salimi et al., 2021. Explaining Black-Box Algorithms Using Probabilistic Contrastive Counterfactuals. SIGMOD Conference.

[16] Amit Dhurandhar et al., 2019. One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques. arXiv.

[17] Amit Dhurandhar et al., 2018. TED: Teaching AI to Explain its Decisions. AIES.

[18] Sanjeeb Dash et al., 2019. Generalized Linear Rule Models. ICML.

[19] Seth Flaxman et al., 2016. EU Regulations on Algorithmic Decision-Making and a "Right to Explanation". arXiv.

[20] Suhang Wang et al., 2020. GRACE: Generating Concise and Informative Contrastive Sample to Explain Neural Network Model's Prediction. KDD.

[21] Jianbo Li et al., 2021. Outlier Impact Characterization for Time Series Data. AAAI.

[22] Amit Dhurandhar et al., 2018. Explanations Based on the Missing: Towards Contrastive Explanations with Pertinent Negatives. NeurIPS.