Object Rearrangement Using Learned Implicit Collision Functions

Robotic object rearrangement combines the skills of picking and placing objects. When object models are unavailable, typical collision-checking models may be unable to predict collisions in partial point clouds with occlusions, making generation of collision-free grasping or placement trajectories challenging. We propose a learned collision model that accepts scene and query object point clouds and predicts collisions for 6DOF object poses within the scene. We train the model on a synthetic set of 1 million scene/object point cloud pairs and 2 billion collision queries. We leverage the learned collision model as part of a model predictive path integral (MPPI) policy in a tabletop rearrangement task and show that the policy can plan collision-free grasps and placements for objects unseen in training in both simulated and physical cluttered scenes with a Franka Panda robot. The learned model outperforms both traditional pipelines and learned ablations by 9.8% in accuracy on a dataset of simulated collision queries and is 75x faster than the best-performing baseline. Videos and supplementary material are available at https://research.nvidia.com/publication/2021-03_Object-Rearrangement-Using.
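To illustrate how a collision predictor can enter an MPPI policy as a cost term, the following is a minimal sketch rather than the authors' implementation. It simplifies heavily: the state is a 3D position instead of a 6DOF pose, and `collision_prob` is a hypothetical stand-in (an analytic sphere check) for the learned model that would take scene and object point clouds. The function names, the cost weights, and the obstacle are all assumptions made for illustration.

```python
import numpy as np

def collision_prob(positions, center=np.array([0.5, 0.0, 0.0]), radius=0.2):
    """Hypothetical stand-in for a learned collision model.

    Returns 1.0 for positions inside a sphere obstacle and 0.0 otherwise.
    A learned model would instead score object poses against a scene point cloud.
    """
    d = np.linalg.norm(positions - center, axis=-1)
    return (d < radius).astype(float)

def mppi_step(start, goal, mean, rng, n_samples=256, sigma=0.05,
              lam=1.0, w_coll=100.0):
    """One MPPI update of a nominal sequence of position deltas, shape (H, 3)."""
    # Sample perturbed control sequences around the nominal sequence.
    noise = rng.normal(0.0, sigma, size=(n_samples,) + mean.shape)
    controls = mean[None] + noise                    # (N, H, 3)
    # Roll out each sequence by accumulating deltas from the start position.
    paths = start + np.cumsum(controls, axis=1)      # (N, H, 3)
    # Cost: distance of the final point to the goal plus a collision penalty.
    goal_cost = np.linalg.norm(paths[:, -1] - goal, axis=-1)
    coll_cost = collision_prob(paths).sum(axis=1)
    cost = goal_cost + w_coll * coll_cost
    # Exponentially weight the samples (shifted by the minimum for stability).
    weights = np.exp(-(cost - cost.min()) / lam)
    weights /= weights.sum()
    # The new nominal sequence is the weighted average of the samples.
    return np.einsum("n,nhd->hd", weights, controls)

rng = np.random.default_rng(0)
mean = np.zeros((10, 3))
for _ in range(20):
    mean = mppi_step(np.zeros(3), np.array([1.0, 0.0, 0.0]), mean, rng)
```

Because all sampled rollouts are scored in one batched call, a fast collision predictor matters here: the collision query is evaluated for every sample at every step of the horizon.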
