Visual Grounding of Learned Physical Models

Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions. The abilities to perform physical reasoning and to adapt to new environments, while intrinsic to humans, remain challenging to state-of-the-art computational models. In this work, we present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors. The visual prior predicts a particle-based representation of the system from visual observations. An inference module operates on those particles, predicting and refining estimates of particle locations, object states, and physical parameters, subject to the constraints imposed by the dynamics prior, which we refer to as visual grounding. We demonstrate the effectiveness of our method in environments involving rigid objects, deformable materials, and fluids. Experiments show that our model can infer the physical properties within a few observations, which allows the model to quickly adapt to unseen scenarios and make accurate predictions into the future.

[1]  Joshua B. Tenenbaum,et al.  End-to-End Differentiable Physics for Learning and Control , 2018, NeurIPS.

[2]  Jessica B. Hamrick,et al.  Inferring mass in complex scenes by mental simulation , 2016, Cognition.

[3]  Jessica B. Hamrick,et al.  Simulation as an engine of physical scene understanding , 2013, Proceedings of the National Academy of Sciences.

[4]  Jitendra Malik,et al.  Learning Visual Predictive Models of Physics for Playing Billiards , 2015, ICLR.

[5]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[6]  Razvan Pascanu,et al.  Visual Interaction Networks: Learning a Physics Simulator from Video , 2017, NIPS.

[7]  Hao Su,et al.  A Point Set Generation Network for 3D Object Reconstruction from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Raia Hadsell,et al.  Graph networks as learnable physics engines for inference and control , 2018, ICML.

[9]  Vladlen Koltun,et al.  Lagrangian Fluid Simulation with Continuous Convolutions , 2020, ICLR.

[10]  Connor Schenck,et al.  SPNets: Differentiable Fluid Dynamics for Deep Neural Networks , 2018, CoRL.

[11]  Chen Sun,et al.  Unsupervised Learning of Object Structure and Dynamics from Videos , 2019, NeurIPS.

[12]  Joshua B. Tenenbaum,et al.  Modeling human intuitions about liquid flow with particle-based simulation , 2018, PLoS Comput. Biol..

[13]  Sergey Levine,et al.  Stochastic Variational Video Prediction , 2017, ICLR.

[14]  Ce Liu,et al.  Exploring new representations and applications for motion analysis , 2009 .

[15]  Jonathan Ragan-Kelley,et al.  DiffTaichi: Differentiable Programming for Physical Simulation , 2019, ICLR.

[16]  Stephen A. Billings,et al.  Non-linear system identification using neural networks , 1990 .

[17]  Jiajun Wu,et al.  Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning , 2015, NIPS.

[18]  Tae-Yong Kim,et al.  Unified particle physics for real-time applications , 2014, ACM Trans. Graph..

[19]  Chuang Gan,et al.  CLEVRER: CoLlision Events for Video REpresentation and Reasoning , 2020, ICLR.

[20]  Biao Huang,et al.  System Identification , 2000, Control Theory for Physicists.

[21]  Sergey Levine,et al.  Deep visual foresight for planning robot motion , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[22]  Sergey Levine,et al.  Reasoning About Physical Interactions with Object-Oriented Prediction and Planning , 2018, ICLR.

[23]  Joshua B. Tenenbaum,et al.  A Compositional Object-Based Approach to Learning Physical Dynamics , 2016, ICLR.

[24]  Jonas Degrave,et al.  A DIFFERENTIABLE PHYSICS ENGINE FOR DEEP LEARNING IN ROBOTICS , 2016, Front. Neurorobot..

[25]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[26]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[27]  Razvan Pascanu,et al.  Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.

[28]  Eric A. Wan,et al.  MODEL PREDICTIVE NEURAL CONTROL OF A HIGH-FIDELITY HELICOPTER MODEL , 2001 .

[29]  Ming C. Lin,et al.  Differentiable Cloth Simulation for Inverse Problems , 2019, NeurIPS.

[30]  Yuval Tassa,et al.  MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[31]  Jure Leskovec,et al.  Learning to Simulate Complex Physics with Graph Networks , 2020, ICML.

[32]  Miles Macklin,et al.  Position based fluids , 2013, ACM Trans. Graph..

[33]  Petre Stoica,et al.  Decentralized Control , 2018, The Control Systems Handbook.

[34]  Jürgen Schmidhuber,et al.  Recurrent World Models Facilitate Policy Evolution , 2018, NeurIPS.

[35]  Jiajun Wu,et al.  Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids , 2018, ICLR.

[36]  Daniel L. K. Yamins,et al.  Flexible Neural Representation for Physics Prediction , 2018, NeurIPS.

[37]  Jiajun Wu,et al.  Propagation Networks for Model-Based Control Under Partial Observation , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[38]  Jiajun Wu,et al.  DensePhysNet: Learning Dense Physical Object Representations via Multi-step Dynamic Interactions , 2019, Robotics: Science and Systems.

[39]  Jiancheng Liu,et al.  ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[40]  Learning Compositional Koopman Operators for Model-Based Control , 2019, ICLR.

[41]  Jiajun Wu,et al.  Learning to See Physics via Visual De-animation , 2017, NIPS.

[42]  Ruben Villegas,et al.  Learning Latent Dynamics for Planning from Pixels , 2018, ICML.