Robust Multimodal Command Interpretation for Human-Multirobot Interaction

In this work, we propose a multimodal interaction framework for robust human-multirobot communication in outdoor environments. In such scenarios, human and environmental factors can introduce errors, noise, and misinterpretations of commands. The main goal of this work is to improve the robustness of human-robot interaction systems under these conditions. Specifically, we propose a multimodal fusion method based on the following steps: for each communication channel, unimodal classifiers are first deployed to generate unimodal interpretations of the human inputs; the unimodal outcomes are then grouped into multimodal recognition lines, each representing a possible interpretation of a sequence of multimodal inputs; finally, these lines are assessed to recognize the human commands. We discuss the system at work in a real-world case study in the SHERPA domain.
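To make the three-step pipeline concrete, the sketch below shows one plausible reading of it in Python. All names (`UnimodalOutcome`, `mock_recognize`, `build_recognition_lines`, `assess_lines`) and the product-of-confidences scoring are illustrative assumptions, not the authors' actual implementation.

```python
# Hedged sketch of the three-step fusion pipeline described in the
# abstract. Names and scoring are hypothetical, not the paper's API.
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple

@dataclass
class UnimodalOutcome:
    channel: str       # communication channel, e.g. "speech" or "gesture"
    label: str         # interpretation produced by the unimodal classifier
    confidence: float  # classifier score in [0, 1]

def mock_recognize(signal: str) -> List[Tuple[str, float]]:
    # Stand-in for a real speech/gesture recognizer: returns ranked
    # (label, score) candidates for one input on one channel.
    return [("go_to_victim", 0.7), ("stop", 0.3)]

def classify_channels(raw_inputs: Dict[str, str]) -> List[List[UnimodalOutcome]]:
    """Step 1: run one unimodal classifier per communication channel,
    keeping every candidate interpretation of each input."""
    return [[UnimodalOutcome(channel, label, score)
             for label, score in mock_recognize(signal)]
            for channel, signal in raw_inputs.items()]

def build_recognition_lines(per_channel: List[List[UnimodalOutcome]]):
    """Step 2: group unimodal outcomes into multimodal recognition
    lines, each a joint hypothesis over the multimodal inputs."""
    return [list(line) for line in product(*per_channel)]

def assess_lines(lines) -> List[UnimodalOutcome]:
    """Step 3: assess each line and pick the most plausible command.
    Scoring by product of confidences is an illustrative choice, not
    necessarily the assessment used in the paper."""
    def score(line):
        s = 1.0
        for outcome in line:
            s *= outcome.confidence
        return s
    return max(lines, key=score)

if __name__ == "__main__":
    inputs = {"speech": "<audio frame>", "gesture": "<skeleton track>"}
    lines = build_recognition_lines(classify_channels(inputs))
    best = assess_lines(lines)
    print([(o.channel, o.label) for o in best])
```

One design point this sketch tries to capture: enumerating recognition lines keeps every unimodal hypothesis alive until the final assessment, so a noisy or misclassified input on one channel can still be overruled by the other channels, which is plausibly where the claimed robustness comes from.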
