Anticipating Many Futures: Online Human Motion Prediction and Generation for Human-Robot Interaction

Fluent and safe interactions of humans and robots require both partners to anticipate the others' actions. The bottleneck of most methods is the lack of an accurate model of natural human motion. In this work, we present a conditional variational autoencoder that is trained to predict a window of future human motion given a window of past frames. Using skeletal data obtained from RGB depth images, we show how this unsupervised approach can be used for online motion prediction for up to 1660 ms. Additionally, we demonstrate online target prediction within the first 300–500 ms after motion onset without the use of target specific training data. The advantage of our probabilistic approach is the possibility to draw samples of possible future motion patterns. Finally, we investigate how movements and kinematic cues are represented on the learned low dimensional manifold.

[1]  Julie A. Shah,et al.  Fast target prediction of human reaching motion for cooperative human-robot manipulation tasks using time series classification , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[2]  B. Schölkopf,et al.  Modeling Human Motion Using Binary Latent Variables , 2007 .

[3]  Zhengyou Zhang,et al.  Microsoft Kinect Sensor and Its Effect , 2012, IEEE Multim..

[4]  Jitendra Malik,et al.  Recurrent Network Models for Human Dynamics , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[5]  M. Candidi,et al.  Kinematics fingerprints of leader and follower role-taking during cooperative joint actions , 2013, Experimental Brain Research.

[6]  Martial Hebert,et al.  An Uncertain Future: Forecasting from Static Images Using Variational Autoencoders , 2016, ECCV.

[7]  Jan Peters,et al.  Anticipative Interaction Primitives for Human-Robot Collaboration , 2016, AAAI Fall Symposia.

[8]  Silvio Savarese,et al.  Structural-RNN: Deep Learning on Spatio-Temporal Graphs , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Michael J. Black,et al.  On Human Motion Prediction Using Recurrent Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Oliver Kroemer,et al.  Interaction primitives for human-robot cooperation tasks , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[11]  Dmitry Berenson,et al.  A framework for unsupervised online human reaching motion recognition and early prediction , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[12]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[13]  Siddhartha S. Srinivasa,et al.  Legibility and predictability of robot motion , 2013, 2013 8th ACM/IEEE International Conference on Human-Robot Interaction (HRI).

[14]  L. Wheaton,et al.  I give you a cup, I get a cup: A kinematic study on social intention , 2014, Neuropsychologia.

[15]  Danica Kragic,et al.  Deep Representation Learning for Human Motion Prediction and Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Daan Wierstra,et al.  One-Shot Generalization in Deep Generative Models , 2016, ICML.

[17]  Mitsuo Kawato,et al.  MOSAIC Model for Sensorimotor Learning and Control , 2001, Neural Computation.

[18]  Honglak Lee,et al.  Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.

[19]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[20]  U. Castiello,et al.  Does the intention to communicate affect action kinematics? , 2009, Consciousness and Cognition.

[21]  Giovanni Pezzulo,et al.  What should I do next? Using shared representations to solve interaction problems , 2011, Experimental Brain Research.

[22]  Hema Swetha Koppula,et al.  Anticipatory Planning for Human-Robot Teams , 2014, ISER.

[23]  Sergey Levine,et al.  Unsupervised Learning for Physical Interaction through Video Prediction , 2016, NIPS.

[24]  Dmitry Berenson,et al.  Human-robot collaborative manipulation planning using early prediction of human motion , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.