Research on multimodal human-robot interaction based on speech and gesture

Abstract This paper presents a multimodal human-robot interaction method based on the fusion of speech and gesture. A robot control command system is designed for the interface; it transforms the user's speech and gestures into commands that the robot can execute. The Microsoft Speech SDK is used to capture the operator's speech, and a corpus-based maximum entropy classification algorithm for natural language understanding then generates commands from it. A Leap Motion sensor captures the operator's gestures, and an Interval Kalman Filter (IKF) is applied to the measured data to reduce the sensor's inherent noise. The advantage of the proposed method is that combining speech and gesture makes human-robot interaction more convenient and direct. Finally, a series of experiments was carried out to validate the method; the results indicate that it performs better than the methods it was compared against.
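To illustrate the noise-reduction step mentioned in the abstract, the sketch below smooths a noisy one-dimensional position signal with an ordinary scalar Kalman filter. It is not the Interval Kalman Filter used in the paper, which additionally bounds the model parameters by intervals; the function name `kalman_smooth`, the constant-position model, and the noise variances are all assumptions chosen for illustration, and the input is simulated rather than read from a Leap Motion device.

```python
import random


def kalman_smooth(measurements, process_var=1e-4, meas_var=0.01):
    """Smooth a 1-D signal with a scalar Kalman filter.

    Assumes a constant-position model: the true value drifts only by
    process noise between samples. Both variances are illustrative.
    """
    estimate = measurements[0]  # initialise from the first sample
    error_var = 1.0             # large initial uncertainty
    smoothed = []
    for z in measurements:
        # Predict: the state is unchanged, the uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


if __name__ == "__main__":
    random.seed(0)
    true_position = 0.25  # hypothetical palm coordinate, in metres
    noisy = [true_position + random.gauss(0.0, 0.1) for _ in range(200)]
    filtered = kalman_smooth(noisy)
    print("last raw sample:     ", round(noisy[-1], 4))
    print("last filtered sample:", round(filtered[-1], 4))
```

With a small process variance relative to the measurement variance, the gain settles at a low value, so each new sample moves the estimate only slightly and the sensor jitter is averaged out.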
