Learning to Poke by Poking: Experiential Learning of Intuitive Physics

We investigate an experiential learning paradigm for acquiring an internal model of intuitive physics. Our model is evaluated on a real-world robotic manipulation task that requires displacing objects to target locations by poking. The robot gathered over 400 hours of experience by executing more than 100K pokes on different objects. We propose a novel approach based on deep neural networks for modeling the dynamics of robot's interactions directly from images, by jointly estimating forward and inverse models of dynamics. The inverse model objective provides supervision to construct informative visual features, which the forward model can then predict and in turn regularize the feature space for the inverse model. The interplay between these two objectives creates useful, accurate models that can then be used for multi-step decision making. This formulation has the additional benefit that it is possible to learn forward models in an abstract feature space and thus alleviate the need of predicting pixels. Our experiments show that this joint modeling approach outperforms alternative methods.

[1]  R. C. Oldfield THE PERCEPTION OF CAUSALITY , 1963 .

[2]  Michael I. Jordan,et al.  Forward Models: Supervised Learning with a Distal Teacher , 1992, Cogn. Sci..

[3]  Michael I. Jordan,et al.  An internal model for sensorimotor integration. , 1995, Science.

[4]  A. Gopnik,et al.  The scientist in the crib : minds, brains, and how children learn , 1999 .

[5]  Michael Gasser,et al.  The Development of Embodied Cognition: Six Lessons from Babies , 2005, Artificial Life.

[6]  Steven M. LaValle,et al.  Planning algorithms , 2006 .

[7]  Martin A. Riedmiller,et al.  The Neuro Slot Car Racer: Reinforcement Learning in a Real World Setting , 2009, 2009 International Conference on Machine Learning and Applications.

[8]  Martin A. Riedmiller,et al.  Deep learning of visual control policies , 2010, ESANN.

[9]  Jessica B. Hamrick,et al.  Probabilistic internal physics models guide judgments about object dynamics , 2011, CogSci.

[10]  Jessica B. Hamrick Internal physics models guide probabilistic judgments about object dynamics , 2011 .

[11]  Takeo Igarashi,et al.  Automatic learning of pushing strategy for delivery of irregular-shaped objects , 2011, 2011 IEEE International Conference on Robotics and Automation.

[12]  Rustam Stolkin,et al.  Learning to predict how rigid objects behave under simple manipulation , 2011, 2011 IEEE International Conference on Robotics and Automation.

[13]  Siddhartha S. Srinivasa,et al.  A Planning Framework for Non-Prehensile Manipulation under Clutter and Uncertainty , 2012, Autonomous Robots.

[14]  Martin A. Riedmiller,et al.  Autonomous reinforcement learning on raw visual input data in a real world application , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).

[15]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[16]  David Q. Mayne,et al.  Model predictive control: Recent developments and future promise , 2014, Autom..

[17]  Antonio Torralba,et al.  Anticipating the future by watching unlabeled video , 2015, ArXiv.

[18]  Manuela M. Veloso,et al.  Push-manipulation of complex passive mobile objects using experimentally acquired motion models , 2015, Auton. Robots.

[19]  Thomas B. Schön,et al.  From Pixels to Torques: Policy Learning with Deep Dynamical Models , 2015, ICML 2015.

[20]  Jiajun Wu,et al.  Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning , 2015, NIPS.

[21]  Martin A. Riedmiller,et al.  Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images , 2015, NIPS.

[22]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[23]  Honglak Lee,et al.  Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.

[24]  Ross A. Knepper,et al.  DeepMPC: Learning Deep Latent Features for Model Predictive Control , 2015, Robotics: Science and Systems.

[25]  Emanuel Todorov,et al.  Physically consistent state estimation and system identification for contacts , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).

[26]  Yuval Tassa,et al.  Continuous control with deep reinforcement learning , 2015, ICLR.

[27]  Jitendra Malik,et al.  Learning Visual Predictive Models of Physics for Playing Billiards , 2015, ICLR.

[28]  Ali Farhadi,et al.  Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Sergey Levine,et al.  End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..

[30]  Rob Fergus,et al.  Learning Physical Intuition of Block Towers by Example , 2016, ICML.

[31]  Sergey Levine,et al.  Deep spatial autoencoders for visuomotor learning , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[32]  Abhinav Gupta,et al.  The Curious Robot: Learning Visual Representations via Physical Interactions , 2016, ECCV.

[33]  Abhinav Gupta,et al.  Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[34]  DarrellTrevor,et al.  End-to-end training of deep visuomotor policies , 2016 .

[35]  Sergey Levine,et al.  Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res..