To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions

In hybrid human-AI systems, users need to decide whether or not to trust an algorithmic prediction even though the true error of that prediction is unknown. To accommodate such settings, we introduce RETRO-VIZ, a method for (i) estimating and (ii) explaining the trustworthiness of regression predictions. It consists of RETRO, a quantitative estimate of the trustworthiness of a prediction, and VIZ, a visual explanation that helps users identify why that prediction is (not) trustworthy. We find that RETRO-scores correlate negatively with prediction error across 117 experimental settings, indicating that RETRO provides a useful measure for distinguishing trustworthy predictions from untrustworthy ones. In a user study with 41 participants, we find that VIZ-explanations help users identify whether a prediction is trustworthy: given a pair of predictions, 95.1% of participants on average correctly select the more trustworthy one, and 75.6% on average can accurately describe why a prediction seems (un)trustworthy. Finally, we find that the vast majority of users subjectively experience RETRO-VIZ as a useful tool for assessing the trustworthiness of algorithmic predictions.
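
The abstract's central quantitative claim is that a useful trustworthiness estimate should correlate negatively with prediction error on held-out data. The sketch below illustrates how such a check can be run; it is not the authors' RETRO implementation. The retro_score function (a simple distance-to-training-set heuristic), the synthetic dataset, the regressor, and the neighborhood size k are all illustrative assumptions.

import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def retro_score(X_ref, X_query, k=10):
    # Placeholder trust score (illustrative assumption, not the paper's RETRO):
    # the negative mean distance of a query point to its k nearest reference
    # (training) instances. Points close to the reference set get higher scores.
    nn = NearestNeighbors(n_neighbors=k).fit(X_ref)
    distances, _ = nn.kneighbors(X_query)
    return -distances.mean(axis=1)


# Synthetic regression data stands in for the datasets used in the paper.
X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale features so nearest-neighbour distances are comparable across features.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
abs_error = np.abs(model.predict(X_test) - y_test)
trust = retro_score(X_train_s, X_test_s)

# A useful trustworthiness estimate should yield a negative rank correlation
# with the absolute prediction error on held-out data.
rho, p = spearmanr(trust, abs_error)
print(f"Spearman correlation between trust score and |error|: {rho:.3f} (p = {p:.3g})")

A clearly negative Spearman correlation on held-out data would indicate, as in the paper's 117 experimental settings, that the score helps separate trustworthy predictions from untrustworthy ones.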
