To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions

In hybrid human-AI systems, users need to decide whether or not to trust an algorithmic prediction even though the true error of that prediction is unknown. To accommodate such settings, we introduce RETRO-VIZ, a method for (i) estimating and (ii) explaining the trustworthiness of regression predictions. It consists of RETRO, a quantitative estimate of the trustworthiness of a prediction, and VIZ, a visual explanation that helps users identify why that prediction is (not) trustworthy. We find that RETRO-scores correlate negatively with prediction error across 117 experimental settings, indicating that RETRO provides a useful measure for distinguishing trustworthy predictions from untrustworthy ones. In a user study with 41 participants, we find that VIZ-explanations help users identify whether a prediction is trustworthy: given a pair of predictions, 95.1% of participants on average correctly select the more trustworthy one, and 75.6% on average can accurately describe why a prediction seems (un)trustworthy. Finally, we find that the vast majority of users subjectively experience RETRO-VIZ as a useful tool for assessing the trustworthiness of algorithmic predictions.
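
The abstract's central quantitative claim is that a useful trustworthiness estimate should correlate negatively with prediction error on held-out data. The sketch below illustrates how such a check can be run; it is not the authors' RETRO implementation. The retro_score function (a simple distance-to-training-set heuristic), the synthetic dataset, the regressor, and the neighborhood size k are all illustrative assumptions.

import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def retro_score(X_ref, X_query, k=10):
    # Placeholder trust score (illustrative assumption, not the paper's RETRO):
    # the negative mean distance of a query point to its k nearest reference
    # (training) instances. Points close to the reference set get higher scores.
    nn = NearestNeighbors(n_neighbors=k).fit(X_ref)
    distances, _ = nn.kneighbors(X_query)
    return -distances.mean(axis=1)


# Synthetic regression data stands in for the datasets used in the paper.
X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale features so nearest-neighbour distances are comparable across features.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
abs_error = np.abs(model.predict(X_test) - y_test)
trust = retro_score(X_train_s, X_test_s)

# A useful trustworthiness estimate should yield a negative rank correlation
# with the absolute prediction error on held-out data.
rho, p = spearmanr(trust, abs_error)
print(f"Spearman correlation between trust score and |error|: {rho:.3f} (p = {p:.3g})")

A clearly negative Spearman correlation on held-out data would indicate, as in the paper's 117 experimental settings, that the score helps separate trustworthy predictions from untrustworthy ones.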
