Paper Information - Propagation Networks for Model-Based Control Under Partial Observation

There has been increasing interest in learning dynamics simulators for model-based control. Compared with off-the-shelf physics engines, a learnable simulator can quickly adapt to unseen objects, scenes, and tasks. However, existing models like interaction networks only work for fully observable systems; they also only consider pairwise interactions within a single time step. Both limitations restrict their use in practical systems. We introduce Propagation Networks (PropNet), a differentiable, learnable dynamics model that handles partially observable scenarios and enables instantaneous propagation of signals beyond pairwise interactions. With these innovations, our propagation networks not only outperform current learnable physics engines in forward simulation, but also achieve superior performance on various control tasks. Compared with existing deep reinforcement learning algorithms, model-based control with propagation networks is more accurate, efficient, and generalizable to novel, partially observable scenes and tasks.
