Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning

Predictive models have been at the core of many robotic systems, from quadrotors to walking robots. However, it has been challenging to develop and apply such models to practical robotic manipulation due to high-dimensional sensory observations such as images. Previous approaches to learning models for robotic manipulation have either learned whole-image dynamics or used autoencoders to learn dynamics in a low-dimensional latent state. In this work, we introduce model-based prediction with self-supervised visual correspondence learning, and show not only that this is possible, but that these predictive models offer compelling performance improvements over alternative methods for vision-based RL that rely on autoencoder-style vision training. Through simulation experiments, we demonstrate that our models generalize with better precision, particularly in 3D scenes, in scenes involving occlusion, and across object categories. Additionally, we validate through hardware experiments that our method transfers effectively to the real world. Videos and supplementary materials are available at this https URL
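
To make the idea concrete, the sketch below shows one way such a pipeline could be structured: a learned dynamics model operating on self-supervised keypoint locations (rather than raw pixels or an autoencoder latent), rolled out inside a simple random-shooting MPC loop. This is a minimal illustration under assumed interfaces, not the authors' implementation; names such as `KeypointDynamics` and `plan_actions` are hypothetical, and keypoints are assumed to come from a pretrained dense-correspondence model.

```python
# Minimal sketch (not the authors' code): dynamics over self-supervised keypoints + random-shooting MPC.
# Assumes keypoint locations have already been extracted by a pretrained correspondence network.
import torch
import torch.nn as nn


class KeypointDynamics(nn.Module):
    """Predicts the next set of keypoint locations given current keypoints and an action."""

    def __init__(self, num_keypoints: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.num_keypoints = num_keypoints
        in_dim = num_keypoints * 3 + action_dim  # (x, y, z) per keypoint, plus the action
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_keypoints * 3),
        )

    def forward(self, keypoints: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # keypoints: (B, K, 3), action: (B, A); predict a residual update to the keypoints.
        flat = keypoints.flatten(start_dim=1)
        delta = self.net(torch.cat([flat, action], dim=-1))
        return keypoints + delta.view(-1, self.num_keypoints, 3)


def plan_actions(model, current_kp, goal_kp, horizon=10, num_samples=512, action_dim=2):
    """Random-shooting MPC: sample action sequences, roll out the keypoint dynamics,
    and return the first action of the sequence whose final keypoints land closest to the goal."""
    with torch.no_grad():
        actions = torch.randn(num_samples, horizon, action_dim)            # candidate sequences
        kp = current_kp.unsqueeze(0).expand(num_samples, -1, -1).clone()   # (N, K, 3)
        for t in range(horizon):
            kp = model(kp, actions[:, t])
        costs = ((kp - goal_kp.unsqueeze(0)) ** 2).sum(dim=(1, 2))         # distance to goal keypoints
        best = costs.argmin()
    return actions[best, 0]


if __name__ == "__main__":
    model = KeypointDynamics(num_keypoints=8, action_dim=2)
    current = torch.rand(8, 3)   # placeholder keypoints from a correspondence model
    goal = torch.rand(8, 3)      # placeholder goal keypoint configuration
    print("first planned action:", plan_actions(model, current, goal))
```

In this view, the planning objective is expressed directly in keypoint space, which is what allows the costs above to be defined as distances between predicted and goal keypoint configurations rather than pixel reconstruction errors.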
