The impact of human–robot multimodal communication on mental workload, usability preference, and expectations of robot behavior

Multimodal communication between humans and autonomous robots is essential for enhancing the effectiveness of human–robot team performance in complex, novel environments, such as military intelligence, surveillance, and reconnaissance operations in urban settings. A systematic approach is needed to evaluate how each modality contributes to the user's ability to perform successfully and safely. This paper addresses the effects of unidirectional speech and gesture communication on perceived workload, usability preferences, and expectations of robot behavior while commanding a robot teammate to perform a spatial-navigation task. Each type of communication was used alone or in combination. Results reveal that although the speech-alone condition elicited the lowest perceived workload, usability preferences and expectations of robot behavior were the same across all communication conditions. Further, workload ratings in the gesture and speech-gesture conditions were similar, indicating that systems employing gesture communication could also support speech communication with little to no additional subjectively perceived cognitive burden on the user. Findings also reveal that workload alone should not be the sole determining factor of communication preference during system and task evaluation and design. Additionally, perceived workload did not appear to negatively affect expectations of the robot's behavior. Recommendations for future evaluations of human–robot communication are provided.
