Learning visual policies for building 3D shape categories

Manipulation and assembly tasks require non-trivial planning of actions depending on the environment and the final goal. Previous work in this domain often assembles particular instances of objects from known sets of primitives. In contrast, here we aim to handle varying sets of primitives and to construct different objects of the same shape category. Given a single object instance of a category, e.g., an arch, and a binary shape classifier, we learn a visual policy to assemble other instances of the same category. In particular, we propose a disassembly procedure and learn a state policy that discovers new object instances and their assembly plans in state space. We then render simulated states in the observation space and learn a heatmap representation to predict alternative actions from a given input image. To validate our approach, we first demonstrate its efficiency for building object categories in state space. We then show the success of our visual policies for building arches from different primitives. Moreover, we demonstrate (i) the reactive ability of our method to re-assemble objects using additional primitives and (ii) the robust performance of our policy on unseen primitives resembling the building blocks used during training. Our visual assembly policies are trained with no real images and reach up to 95% success rate when evaluated on a real robot.
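
The abstract describes a heatmap representation that maps a rendered observation to per-pixel action scores. The following is a minimal sketch of that idea, assuming a simple fully convolutional encoder-decoder with separate pick and place score maps; the module sizes, the two-head layout, and all names here are illustrative assumptions, not the paper's actual architecture.

```python
# Illustrative sketch of a heatmap-based visual policy (not the paper's
# implementation): an image is mapped to per-pixel action scores, and the
# highest-scoring pixels are proposed as pick/place actions.
import torch
import torch.nn as nn


class HeatmapPolicy(nn.Module):
    """Fully convolutional network: RGB observation -> per-pixel action logits."""

    def __init__(self, num_maps: int = 2):  # e.g. one 'pick' and one 'place' map
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, num_maps, 4, stride=2, padding=1),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # Output shape: (B, num_maps, H, W), same spatial size as the input.
        return self.decoder(self.encoder(image))


def select_action(heatmaps: torch.Tensor) -> torch.Tensor:
    """Return the highest-scoring pixel (row, col) in each score map."""
    b, c, h, w = heatmaps.shape
    flat = heatmaps.view(b, c, -1).argmax(dim=-1)
    return torch.stack((flat // w, flat % w), dim=-1)  # (B, num_maps, 2)


if __name__ == "__main__":
    policy = HeatmapPolicy().eval()
    obs = torch.rand(1, 3, 128, 128)          # stand-in for a rendered simulated state
    with torch.no_grad():
        pixels = select_action(policy(obs))    # proposed pick/place pixel coordinates
    print(pixels.shape)                        # torch.Size([1, 2, 2])
```

In this reading, supervision for such a network would come from the disassembly procedure: states discovered in state space are rendered, and the corresponding assembly actions provide per-pixel targets, which is what allows training without real images.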
