AI Explainability 360 Toolkit

As machine learning algorithms make inroads into our lives and society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. Moreover, these stakeholders, whether they be government regulators, affected citizens, domain experts, or developers, present different requirements for explanations. To address these needs, we introduce AI Explainability 360, an open-source software toolkit featuring eight diverse state-of-the-art explainability methods, two evaluation metrics, and an extensible software architecture that organizes these methods according to their use in the AI modeling pipeline. We have also implemented enhancements that bring research innovations closer to consumers of explanations: simplified, more accessible versions of the algorithms; guidance material to help users navigate the space of explanations; and tutorials and an interactive web demo that introduce AI explainability to practitioners. The toolkit can help improve the transparency of machine learning models and provides a platform for integrating new explainability techniques as they are developed.
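
Since the abstract describes a toolkit with a common interface across diverse explainability methods, a brief usage sketch may help orient practitioners. The following is a minimal example assuming the ProtodashExplainer class from the public aix360 package; the import path, the explain() signature, and the return values are assumptions based on the open-source release, not details stated in this abstract.

```python
# Minimal sketch of using AI Explainability 360 for prototype-based
# explanation. Assumes the public aix360 package; the class name,
# explain() signature, and return structure are assumptions.
import numpy as np
from aix360.algorithms.protodash import ProtodashExplainer  # assumed import path

# Toy dataset: 200 samples, 5 features (stand-in for real training data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

# Select m prototypes from X that summarize the dataset. explain(X, Y, m)
# is assumed to return prototype importance weights, the indices of the
# selected prototypes, and per-step objective values.
explainer = ProtodashExplainer()
weights, proto_idx, _ = explainer.explain(X, X, m=5)

print("Prototype indices:", proto_idx)
print("Importance weights:", np.round(weights, 3))
```

Other explainers in the toolkit would be swapped in through the same fit/explain-style workflow, which is what the extensible architecture described above is intended to support.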
