Color for Characters - Effects of Visual Explanations of AI on Trust and Observability

The present study investigates the effects of prototypical visualization approaches aimed at increasing the explainability of machine learning systems on perceived trustworthiness and observability. As the number of processes automated by artificial intelligence (AI) increases, so does the need to investigate users' perceptions. Previous research on explainable AI (XAI) tends to focus on technological optimization, and the limited amount of empirical user research leaves key questions unanswered, such as which XAI designs actually improve perceived trustworthiness and observability. We assessed three visual explanation approaches: a table with the classification scores used for classification alone, or that table combined with one of two different backtraced visual explanations. In an online experiment with a within-subjects design (N = 83), we examined the effects on trust and observability. While observability benefited from visual explanations, information-rich explanations also led to decreased trust. Explanations can support human-AI interaction, but differentiated effects on trust and observability have to be expected. The suitability of different explanatory approaches for individual AI applications should therefore be examined further to ensure a high level of trust and observability in, for example, automated image processing.
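
To make the two kinds of explanation output described above more concrete, the following is a minimal sketch only, not the study's actual material: it assumes a small PyTorch classifier for 28x28 grayscale character images and uses gradient-times-input as a simple stand-in for a backtraced relevance method such as layer-wise relevance propagation. The model is untrained and the input is a random placeholder, so all numbers are purely illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """Toy classifier for 28x28 grayscale character images (untrained)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * 14 * 14, num_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv(x)), 2)
        return self.fc(x.flatten(1))

model = SmallCNN().eval()

# Placeholder input; in the study this would be a handwritten character image.
image = torch.rand(1, 1, 28, 28, requires_grad=True)

logits = model(image)
scores = F.softmax(logits, dim=1).squeeze(0)

# Condition 1: the "score table" -- per-class probabilities shown to participants.
for cls, p in enumerate(scores.tolist()):
    print(f"class {cls}: {p:.3f}")

# Conditions 2/3: a backtraced visual explanation. Here gradient x input serves
# as a simple surrogate for the relevance heatmaps used in the experiment.
top_class = int(scores.argmax())
model.zero_grad()
logits[0, top_class].backward()
relevance = (image.grad * image.detach()).squeeze()  # 28x28 heatmap to overlay on the input
```

The resulting relevance map would then be rendered as a color overlay on the input character, which is roughly the form a backtraced visual explanation takes in such a setup.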
