Machine Learning Explainability for External Stakeholders

As machine learning is increasingly deployed in high-stakes contexts affecting people's livelihoods, there have been growing calls to open the black box and to make machine learning algorithms more explainable. Providing useful explanations requires careful consideration of the needs of stakeholders, including end-users, regulators, and domain experts. Despite this need, little work has been done to facilitate inter-stakeholder conversation around explainable machine learning. To help address this gap, we conducted a closed-door, day-long workshop among academics, industry experts, legal scholars, and policymakers to develop a shared language around explainability and to understand the current shortcomings of, and potential solutions for, deploying explainable machine learning in service of transparency goals. We also asked participants to share case studies of deploying explainable machine learning at scale. In this paper, we provide a short summary of various case studies of explainable machine learning, distill lessons from those studies, and discuss open challenges.
