An Application of Convolutional Neural Networks on Human Intention Prediction

Due to the rapidly increasing need for human-robot interaction (HRI), more intelligent robots are in demand. However, the vast majority of robots can only follow strict instructions, which seriously restricts their flexibility and versatility. A critical factor that degrades the HRI experience is that robots cannot understand human intentions. This study aims to improve robotic intelligence by training robots to understand human intentions. Unlike previous studies that recognize human intentions from distinctive actions, this paper introduces a method to predict human intentions before a single action is completed. An experiment of throwing a ball towards designated targets is conducted to verify the effectiveness of the method. The proposed deep-learning-based method demonstrates the feasibility of applying convolutional neural networks (CNNs) in a novel setting. Experimental results show that the proposed CNN-vote method outperforms three traditional machine learning techniques. In the tested setting, the CNN-vote predictor achieves the highest testing accuracy while requiring relatively less data.
