Responses of Conversational Agents to Health and Lifestyle Prompts: Investigation of Appropriateness and Presentation Structures

Background: Conversational agents (CAs) are systems that mimic human conversation using text or spoken language. Widely used examples include voice-activated systems such as Apple Siri, Google Assistant, Amazon Alexa, and Microsoft Cortana. The use of CAs in health care is on the rise, but their potential safety risks remain understudied.

Objective: This study aimed to analyze how commonly available, general-purpose CAs on smartphones and smart speakers respond to health and lifestyle prompts (questions and open-ended statements) by examining their responses in terms of both content and structure.

Methods: We followed a piloted script to present health- and lifestyle-related prompts to 8 CAs. The CAs' responses were assessed for appropriateness according to prompt type: responses to safety-critical prompts were deemed appropriate if they included a referral to a health professional or service, whereas responses to lifestyle prompts were deemed appropriate if they provided relevant information to address the problem raised in the prompt. Response structure was also examined in terms of information source (Web search–based or precoded), response content style (informative and/or directive), confirmation of prompt recognition, and empathy.

Results: The 8 CAs studied provided a total of 240 responses to 30 prompts. Collectively, they responded appropriately to 41% (46/112) of the safety-critical prompts and 39% (37/96) of the lifestyle prompts. The proportion of appropriate responses dropped when safety-critical prompts were rephrased or when the agent used a voice-only interface. Appropriate responses to safety-critical prompts consisted mostly of directive content and empathy statements, whereas appropriate responses to lifestyle prompts combined informative and directive content.

Conclusions: Our results suggest that commonly available, general-purpose CAs on smartphones and smart speakers with unconstrained natural language interfaces have limited ability to advise on both safety-critical health prompts and lifestyle prompts. Our study also identified some of the response structures the CAs used to present appropriate responses. Further investigation is needed to establish guidelines for designing suitable response structures for different prompt types.
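The Methods describe a simple, prompt-type-dependent appropriateness rule. As a minimal sketch of how that rule could be operationalized during coding (in Python, with hypothetical field names and illustrative example data; the study itself relied on human coders, not software):

```python
from dataclasses import dataclass

SAFETY_CRITICAL = "safety-critical"
LIFESTYLE = "lifestyle"

@dataclass
class CodedResponse:
    """One CA response, coded along the dimensions named in the Methods."""
    prompt_type: str              # SAFETY_CRITICAL or LIFESTYLE
    refers_to_professional: bool  # includes referral to a health professional/service
    addresses_problem: bool       # gives relevant information for the prompted problem

def is_appropriate(response: CodedResponse) -> bool:
    # Appropriateness rule from the Methods: safety-critical prompts require
    # a referral; lifestyle prompts require relevant, problem-addressing content.
    if response.prompt_type == SAFETY_CRITICAL:
        return response.refers_to_professional
    return response.addresses_problem

# Hypothetical coded responses (illustrative only, not data from the study):
coded = [
    CodedResponse(SAFETY_CRITICAL, refers_to_professional=True,  addresses_problem=False),
    CodedResponse(SAFETY_CRITICAL, refers_to_professional=False, addresses_problem=True),
    CodedResponse(LIFESTYLE,       refers_to_professional=False, addresses_problem=True),
]
appropriate = sum(is_appropriate(r) for r in coded)
print(f"{appropriate}/{len(coded)} appropriate")  # prints "2/3 appropriate"
```

The aggregate figures in the Results (41%, 46/112 for safety-critical prompts; 39%, 37/96 for lifestyle prompts) are this same count-over-total tallied per prompt category across all 8 CAs.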
