A Multimodal Human-Machine Interaction Scheme for an Intelligent Robotic Nurse

This paper presents a new multimodal Human-Machine Interaction (HMI) scheme for the cooperation between a robotic nurse (here, a robotic wheelchair) and its human user. The HMI model combines vocal commands, processed by a Personalized Isolated Word Recognition System (PIWRS), with the recognition of Body Pose Angles (BPA) for real-time decision-making. In particular, the HMI scheme is able to: (i) recognize a set of voice commands, (ii) recognize a set of body postures and poses, and (iii) calculate the corresponding body angles from skeletal data obtained through a set of cameras. Furthermore, the HMI scheme receives values from pressure sensors that the user engages throughout the tasks to be executed, which compose the Active Participation System (APS). All these variables are combined for the safe control of an Autonomous Intelligent Robotic Wheelchair (AIRW) used by people in need. Specifically, the procedural steps under study are standing up, turning around, and sitting down.
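As an illustration of how a body angle can be derived from skeletal data, the following minimal sketch computes the angle at one joint from three 3D joint positions. It is not the method used in this work: the function name `joint_angle`, the joint names, and the coordinate values are hypothetical, and the dot-product formula is a common textbook choice assumed here for clarity.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at joint b between segments b->a and b->c.

    a, b, c are (x, y, z) joint positions, e.g. from a skeleton tracker.
    """
    v1 = [a[i] - b[i] for i in range(3)]  # vector from joint b to joint a
    v2 = [c[i] - b[i] for i in range(3)]  # vector from joint b to joint c
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("coincident joints: angle is undefined")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical joint positions in metres (assumed values, not measured data).
hip = (0.0, 1.0, 0.0)
knee = (0.0, 0.5, 0.0)
ankle = (0.5, 0.5, 0.0)
knee_angle = joint_angle(hip, knee, ankle)  # thigh and shank are perpendicular here
```

In a stand-up or sit-down task, an angle of this kind at the knee or hip could be tracked over time and compared against thresholds, although the thresholds and the choice of joints would depend on the actual system.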
