Safe Multimodal Communication in Human-Robot Collaboration

Modern industrial settings are characterized by humans and robots working in close proximity and cooperating on a shared job. Such collaboration, however, requires attention to several aspects. First, communication between the two actors must be natural and efficient. Second, the robot's behavior must comply with safety regulations at all times, guaranteeing a safe collaboration. In this paper, we propose a framework that enables multi-channel communication between humans and robots by leveraging multimodal fusion of voice and gesture commands while always respecting safety regulations. The framework is validated through a comparative experiment, demonstrating that, thanks to multimodal communication, the robot can extract the information needed to perform the required task and, thanks to the safety layer, can scale its speed to ensure the operator's safety.
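The safety layer described above scales the robot's speed according to the operator's proximity. A minimal sketch of such a scaling rule, in the spirit of speed-and-separation monitoring, might look as follows. This is an illustrative assumption, not the paper's implementation; the function `speed_scale` and the thresholds `d_stop` and `d_full` are hypothetical names and values.

```python
def speed_scale(distance_m, d_stop=0.3, d_full=1.5):
    """Return a factor in [0, 1] that scales the robot's nominal speed.

    distance_m -- current human-robot separation (metres)
    d_stop     -- below this separation the robot halts (assumed threshold)
    d_full     -- beyond this separation full speed is allowed (assumed)
    """
    if distance_m <= d_stop:
        return 0.0          # operator too close: stop the robot
    if distance_m >= d_full:
        return 1.0          # operator far enough: no speed reduction
    # linear interpolation between the stop and full-speed thresholds
    return (distance_m - d_stop) / (d_full - d_stop)


nominal_speed = 0.5  # m/s, hypothetical nominal end-effector speed
for d in (0.2, 0.9, 2.0):
    print(f"d = {d:.1f} m -> commanded speed {nominal_speed * speed_scale(d):.2f} m/s")
```

In practice the thresholds would be derived from the applicable safety standard and the robot's stopping distance, but the monotone scaling from zero speed at close range to full speed at a safe distance captures the behavior the experiment demonstrates.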
