A Data-Efficient Deep Learning Approach for Deployable Multimodal Social Robots

Deep supervised and reinforcement learning paradigms (among others) have the potential to endow interactive multimodal social robots with the ability to acquire skills autonomously. However, it remains unclear how they can best be deployed in real-world applications. As a step in this direction, we propose a deep learning-based approach for efficiently training a humanoid robot to play multimodal games, using the game of 'Noughts & Crosses' with two variants as a case study. Its minimum requirements for learning to perceive and interact are a few hundred example images, a few example multimodal dialogues and physical demonstrations of robot manipulation, and automatic simulations. In addition, we propose novel algorithms for robust visual game tracking and for competitive policy learning with high winning rates, which substantially outperform DQN-based baselines. While an automatic evaluation shows evidence that the proposed approach can be easily extended to new games with competitive robot behaviours, a human evaluation with 130 humans playing with the Pepper robot confirms that highly accurate visual perception is required for successful game play.
