Multimodal Interface for Ambient Assisted Living

Multimodal interfaces seamlessly integrate two or more user input modalities in a coordinated manner to enhance the user's interaction with a system. This work proposes a multimodal interface that combines hand tracking and speech interaction for Ambient Intelligence and Ambient Assisted Living environments. The system is composed of two main modules: a Spoken Language Interaction module and a Hand Tracking Interaction module. The Spoken Language Interaction module uses speech recognition services to pass the user's utterance to a natural language understanding component, from which the dialog manager matches the context and returns the corresponding answer to the user. Hand Tracking Interaction is performed with a Microsoft Kinect device, which detects the 3D coordinates of the hand; these coordinates are stabilized and transformed so that the hand can be used as a cursor on the screen. Tests run on the speech recognition and natural language understanding framework reached an accuracy of 83.7% in offering the user a correct answer on the first try. The dialog manager achieved an accuracy of 91.6% in offering the user a correct output, computed on a data set of 3,900 sentences.
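
To make the spoken-language flow concrete, the sketch below traces one utterance through the three stages named above: recognized text goes to a natural language understanding step, and the dialog manager matches the resulting intent against its context to produce an answer. Everything in it is illustrative; the keyword rules, intent names, and answer table are assumptions for the sake of the example, not the components the paper actually uses.

    # Hypothetical sketch of the spoken-language pipeline:
    # ASR text -> NLU (intent extraction) -> dialog manager (context -> answer).
    from dataclasses import dataclass

    # Toy keyword-based NLU: maps an utterance to an intent label.
    # (Assumed rules; the paper's NLU module is not specified in the abstract.)
    INTENT_KEYWORDS = {
        "lights_on": ["turn on the light", "lights on"],
        "weather": ["weather", "temperature outside"],
        "help": ["help", "assist me"],
    }

    def understand(utterance: str) -> str:
        text = utterance.lower()
        for intent, phrases in INTENT_KEYWORDS.items():
            if any(p in text for p in phrases):
                return intent
        return "unknown"

    @dataclass
    class DialogManager:
        # Minimal context: remember the last intent so follow-ups could be resolved.
        last_intent: str = "none"

        def respond(self, intent: str) -> str:
            answers = {
                "lights_on": "Turning the lights on.",
                "weather": "Fetching the weather for you.",
                "help": "How can I assist you?",
            }
            self.last_intent = intent
            return answers.get(intent, "Sorry, I did not understand that.")

    if __name__ == "__main__":
        dm = DialogManager()
        for utterance in ["Please turn on the light", "What is the weather?"]:
            intent = understand(utterance)               # NLU step
            print(utterance, "->", dm.respond(intent))   # dialog manager answer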
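
The hand-tracking path can be sketched the same way. The abstract states that the 3D hand coordinate is stabilized and transformed into a screen cursor but does not name the filter or mapping; the example below assumes a simple exponential moving average for stabilization and a fixed interaction box for the screen projection, both illustrative choices rather than the authors' method.

    # Minimal sketch of the hand-to-cursor mapping: smooth the raw 3D hand
    # position (e.g., from a Kinect skeleton stream) with an exponential
    # moving average, then project the x/y components onto screen pixels.
    SCREEN_W, SCREEN_H = 1920, 1080

    class HandCursor:
        def __init__(self, alpha: float = 0.3):
            self.alpha = alpha      # smoothing factor: lower = steadier, laggier
            self.smoothed = None    # last smoothed (x, y, z) in sensor space

    	def update(self, raw_xyz):
            """Feed one raw hand position (metres, sensor frame); return pixels."""
            if self.smoothed is None:
                self.smoothed = raw_xyz
            else:
                self.smoothed = tuple(
                    self.alpha * r + (1.0 - self.alpha) * s
                    for r, s in zip(raw_xyz, self.smoothed)
                )
            x, y, _z = self.smoothed
            # Map an assumed +/-0.5 m interaction box around the sensor axis
            # onto the full screen; clamp so the cursor stays on-screen.
            px = min(max((x + 0.5) * SCREEN_W, 0), SCREEN_W - 1)
            py = min(max((0.5 - y) * SCREEN_H, 0), SCREEN_H - 1)  # sensor y up -> screen y down
            return int(px), int(py)

    cursor = HandCursor()
    print(cursor.update((0.10, 0.05, 1.2)))  # first sample passes through: (1152, 486)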
