Ensuring a Robust Multimodal Conversational User Interface During Maintenance Work

It has been shown that conversational user interfaces are beneficial in many domains. However, many challenges remain when they are applied in production areas, e.g. as part of a virtual assistant that supports workers in knowledge-intensive maintenance work. Regarding input modalities, touchscreens are failure-prone in wet environments, and the quality of voice recognition is negatively affected by ambient noise. Augmenting a symmetric text- and voice-based user interface with gestural input is a promising solution that provides both efficiency and robust communication. This paper contributes to this research area by providing results on the application of appropriate head and one-hand gestures during maintenance work. We conducted an elicitation study with 20 participants and present a gesture set as its outcome. To facilitate gesture development and integration for application designers, we developed a classification model for head gestures and one for one-hand gestures. Additionally, we demonstrated a proof of concept for operators' acceptance of a multimodal conversational user interface with gestural input support during maintenance work. It comprises two usability tests with 18 participants in realistic but controlled settings: notebook repair (SUS: 82.1) and cutter head maintenance (SUS: 82.7).
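
The reported SUS values of 82.1 and 82.7 follow the standard System Usability Scale scoring rule (odd items contribute response − 1, even items 5 − response, the sum scaled by 2.5). As an illustration only, here is a minimal sketch of that scoring with a hypothetical participant's responses; the study's raw questionnaire data are not part of this abstract:

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten item responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for i, r in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered ones negatively worded.
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Example: one hypothetical participant's questionnaire
print(sus_score([5, 2, 4, 1, 5, 2, 4, 2, 4, 1]))  # -> 85.0
```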
