Visual Grounding of Learned Physical Models

Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions. The abilities to perform physical reasoning and to adapt to new environments, while intrinsic to humans, remain challenging for state-of-the-art computational models. In this work, we present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors. The visual prior predicts a particle-based representation of the system from visual observations. An inference module operates on those particles, predicting and refining estimates of particle locations, object states, and physical parameters, subject to the constraints imposed by the dynamics prior; we refer to this process as visual grounding. We demonstrate the effectiveness of our method in environments involving rigid objects, deformable materials, and fluids. Experiments show that our model can infer the physical properties within a few observations, which allows it to quickly adapt to unseen scenarios and make accurate predictions into the future.
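To make the pipeline described above concrete, the sketch below illustrates one plausible reading of it in PyTorch: a visual prior maps an image to a particle set, a dynamics prior rolls the particles forward conditioned on physical parameters, and gradient-based inference refines the parameter estimates so the rollout matches the visually grounded particles. This is a minimal, hypothetical sketch, not the authors' implementation; all module names, network sizes, and hyperparameters (`VisualPrior`, `DynamicsPrior`, `infer_params`, `N_PARTICLES`, `PARAM_DIM`) are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's code): visual prior -> particles,
# dynamics prior -> rollout, gradient-based refinement of physical parameters.
import torch
import torch.nn as nn

N_PARTICLES, PARAM_DIM = 64, 2  # assumed sizes


class VisualPrior(nn.Module):
    """Predicts a particle-based state (N x 3 positions) from an image."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, N_PARTICLES * 3),
        )

    def forward(self, image):  # image: (B, 3, H, W)
        return self.encoder(image).view(-1, N_PARTICLES, 3)


class DynamicsPrior(nn.Module):
    """Predicts next particle positions from current positions and physical params."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + PARAM_DIM, 64), nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, particles, params):  # particles: (B, N, 3), params: (B, PARAM_DIM)
        p = params.unsqueeze(1).expand(-1, particles.size(1), -1)
        return particles + self.net(torch.cat([particles, p], dim=-1))  # residual update


def infer_params(visual_prior, dynamics_prior, frames, n_steps=100, lr=1e-2):
    """Refine physical-parameter estimates so dynamics-prior rollouts
    match the particles grounded by the visual prior."""
    with torch.no_grad():
        observed = [visual_prior(f) for f in frames]  # particle estimates per frame
    params = torch.zeros(frames[0].size(0), PARAM_DIM, requires_grad=True)
    opt = torch.optim.Adam([params], lr=lr)
    for _ in range(n_steps):
        state, loss = observed[0], 0.0
        for target in observed[1:]:
            state = dynamics_prior(state, params)        # roll forward one step
            loss = loss + ((state - target) ** 2).mean() # match grounded particles
        opt.zero_grad()
        loss.backward()
        opt.step()
    return params.detach()
```

In this reading, only a few observed frames are needed to drive `infer_params`, which matches the abstract's claim that physical properties can be estimated from a handful of observations and then reused for forward prediction in new scenarios.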
