Improving Human Intention Prediction Using Data Augmentation

One of the crucial challenges in human-robot interaction is enabling robots to predict human intentions. In this study, we explore how data augmentation techniques can contribute to human intention prediction when only limited training data is available. Specifically, we conduct experiments on predicting the intentions of a human throwing a ball towards designated targets. Prediction performance with various data augmentation methods is presented and compared. The experimental results show that prediction accuracy can be improved from 50% to 75%.
