Effect of AI Explanations on Human Perceptions of Patient-Facing AI-Powered Healthcare Systems

Ongoing research efforts have examined how artificial intelligence (AI) can help healthcare consumers make sense of their clinical data, such as diagnostic radiology reports. How to promote the acceptance of such novel technology is an active research question. Recent studies highlight the importance of providing local explanations of AI predictions, together with information about model performance, to help users decide whether to trust those predictions. Despite these efforts, little empirical research has quantitatively measured how AI explanations affect healthcare consumers' perceptions of patient-facing, AI-powered healthcare systems. The aim of this study is to evaluate the effects of different AI explanations on people's perceptions of such systems. We designed and deployed a large-scale experiment (N = 3,423) on Amazon Mechanical Turk (MTurk) to evaluate these effects in the context of comprehending radiology reports. We crossed two factors, the extent of explanation provided for a prediction (High vs. Low Transparency) and the disclosed model performance (Good vs. Weak AI Model), to create four conditions, and randomly assigned each participant to one of them. Participants were instructed to classify a radiology report as describing a normal or an abnormal finding, and then completed a post-study survey reporting their perceptions of the AI tool. We found that revealing model performance information promoted people's trust in, and perceived usefulness of, system outputs, whereas providing local explanations for the rationale behind a prediction improved understandability but not necessarily trust. We also found that when model performance was low, the more information the AI system disclosed, the less people trusted it. Lastly, whether participants agreed with the AI's prediction, and whether that prediction was correct, also moderated the effects of AI explanations. We conclude by discussing implications for designing AI systems that help healthcare consumers interpret diagnostic reports.
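To make the experimental design concrete, the following is a minimal Python sketch of the 2x2 between-subjects assignment described above. The factor levels mirror the abstract; everything else (function names, the fixed seed, the simulation loop) is illustrative and not the authors' code.

```python
import random
from collections import Counter

# Minimal sketch (not the authors' code) of the 2x2 between-subjects design:
# explanation transparency x disclosed model performance.
TRANSPARENCY = ["high", "low"]   # extent of local explanation shown
PERFORMANCE = ["good", "weak"]   # disclosed model performance

CONDITIONS = [(t, p) for t in TRANSPARENCY for p in PERFORMANCE]

def assign_condition(rng: random.Random) -> tuple:
    """Randomly assign one participant to one of the four conditions."""
    return rng.choice(CONDITIONS)

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so this demo is reproducible
    # Simulate assignment for N = 3,423 participants, as in the study.
    counts = Counter(assign_condition(rng) for _ in range(3423))
    for cond, n in sorted(counts.items()):
        print(cond, n)  # each cell should receive roughly N / 4 participants
```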

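The abstract does not say which technique generated the local explanations participants saw. Purely as an illustration of what a word-level local explanation for a report classification can look like, the sketch below uses LIME, a widely used local-explanation method, with a toy keyword-based classifier standing in for a real radiology-report model; the cue words and probability values are invented for the demo.

```python
import numpy as np
from lime.lime_text import LimeTextExplainer  # pip install lime

# Hypothetical cue words: their presence pushes the toy classifier
# toward an "abnormal" prediction.
ABNORMAL_CUES = {"opacity", "effusion", "nodule", "fracture"}

def predict_proba(texts):
    """Toy stand-in for a report classifier; returns [P(normal), P(abnormal)]."""
    rows = []
    for t in texts:
        hits = sum(cue in t.lower() for cue in ABNORMAL_CUES)
        p_abnormal = min(0.95, 0.20 + 0.30 * hits)
        rows.append([1.0 - p_abnormal, p_abnormal])
    return np.array(rows)

explainer = LimeTextExplainer(class_names=["normal", "abnormal"])
report = "There is a small right pleural effusion with adjacent airspace opacity."
exp = explainer.explain_instance(report, predict_proba, num_features=5)
print(exp.as_list())  # (word, weight) pairs for the "abnormal" class
```

In an interface like the one the study describes, such (word, weight) pairs could be rendered as highlights over the report text, which is the kind of prediction rationale a High Transparency condition would surface.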