More Than Accuracy: Towards Trustworthy Machine Learning Interfaces for Object Recognition

This paper investigates the user experience of visualizations of a machine learning (ML) system that recognizes objects in images. This matters because even accurate systems can fail in unexpected ways, as misclassifications on photo-sharing websites have shown. In our study, we exposed users with a background in ML to three visualizations of three systems with different levels of accuracy. In interviews, we explored how the visualizations helped users assess the accuracy of the systems in use, and how both the visualization and the system's accuracy affected trust and reliance. We found that participants do not focus only on accuracy when assessing ML systems: they also take the perceived plausibility and severity of misclassifications into account, and they prefer seeing the probability of predictions. Semantically plausible errors are judged as less severe than implausible ones, which suggests that system accuracy could also be communicated through the types of errors a system makes.
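To illustrate the kind of probability display that participants preferred, the sketch below shows one plausible way to turn a classifier's raw scores into a ranked top-k probability readout. This is a minimal, hypothetical example: the labels, scores, and output format are our own illustration and are not taken from the systems studied in the paper.

    # Minimal sketch (assumed setup): converting an object-recognition
    # model's raw logits into the top-k probability display that
    # participants said they preferred. Labels and logits are invented
    # for illustration, not drawn from the paper's actual systems.
    import numpy as np

    def softmax(logits):
        # Subtract the max logit for numerical stability before exponentiating.
        z = logits - logits.max()
        e = np.exp(z)
        return e / e.sum()

    labels = ["cat", "dog", "fox", "rabbit", "car"]   # hypothetical label set
    logits = np.array([4.1, 3.7, 1.2, 0.3, -2.0])     # hypothetical model output

    probs = softmax(logits)
    top_k = np.argsort(probs)[::-1][:3]               # indices of the 3 most probable classes
    for i in top_k:
        print(f"{labels[i]:>8}: {probs[i]:.1%}")

Showing calibrated probabilities alongside predictions in this way would let users distinguish a confident correct prediction from a near-tie between a plausible and an implausible label, which is exactly the distinction our participants drew on when judging error severity.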
