Linguistic Cognitive Load Analysis on Dialogues with an Intelligent Virtual Assistant

Virtual assistants have become fixtures in everyday settings, but most research focuses on their development rather than their use following deployment. To facilitate study of their use in office settings, we introduce OfficeDial, a multimodal dataset containing audio recordings, transcriptions, eye tracking data, and screen recordings from conversations between humans and virtual assistants in office environments. Conversations are paired with physical and behavioral measures of cognitive load. We study how verbal behavior varies with noise level and reveal key relationships among verbal redundancy, disfluency, and noise level. We make our new dataset available to interested researchers to inspire further exploration.
