ZePHyR: Zero-shot Pose Hypothesis Rating

Pose estimation is a basic module in many robot manipulation pipelines. Estimating the pose of objects in the environment can be useful for grasping, motion planning, or manipulation. However, current state-of-the-art methods for pose estimation rely on either large annotated training sets or simulated data. Further, the long training times of these methods prohibit quick interaction with novel objects. To address these issues, we introduce a novel method for zero-shot object pose estimation in clutter. Our approach uses a hypothesis generation and scoring framework, with a focus on learning a scoring function that generalizes to objects not used for training. We achieve zero-shot generalization by rating hypotheses as a function of unordered point differences. We evaluate our method on challenging datasets with both textured and untextured objects in cluttered scenes and demonstrate that it significantly outperforms previous methods on this task. We also demonstrate how our system can be used by quickly scanning and building a model of a novel object, which our method can then use immediately for pose estimation. Our work allows users to estimate the pose of novel objects without requiring any retraining. Additional information can be found on our website: https://bokorn.github.io/zephyr/
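
To make the idea of rating pose hypotheses from unordered point differences more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`rot_z`, `transform`, `point_differences`, `score_hypothesis`, `best_hypothesis`) are hypothetical, the only difference used is a nearest-neighbour distance, and the hand-written pooling score stands in for the learned scoring function described above.

```python
import numpy as np

def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def transform(points, R, t):
    """Apply the rigid pose (R, t) to an (N, 3) array of points."""
    return points @ R.T + t

def point_differences(model_pts, scene_pts, R, t):
    """Per-point difference between the posed model and the observed scene.

    Assumption: the difference is the distance from each posed model point
    to its nearest scene point. Other per-point differences could be used.
    """
    posed = transform(model_pts, R, t)
    dists = np.linalg.norm(posed[:, None, :] - scene_pts[None, :, :], axis=-1)
    return dists.min(axis=1)

def score_hypothesis(diffs):
    """Hand-written stand-in for a learned scoring function.

    Mean and max pooling do not depend on point order, so the score treats
    the differences as an unordered set. Higher is better.
    """
    return -float(np.mean(diffs)) - 0.5 * float(np.max(diffs))

def best_hypothesis(model_pts, scene_pts, hypotheses):
    """Score every (R, t) hypothesis and return the best index and all scores."""
    scores = [
        score_hypothesis(point_differences(model_pts, scene_pts, R, t))
        for R, t in hypotheses
    ]
    return int(np.argmax(scores)), scores

# Toy usage: a synthetic scene generated from a known pose.
rng = np.random.default_rng(0)
model = rng.uniform(-1.0, 1.0, size=(50, 3))
true_R, true_t = rot_z(0.5), np.array([0.3, -0.1, 0.2])
scene = transform(model, true_R, true_t)
candidates = [(true_R, true_t), (rot_z(2.0), true_t), (true_R, true_t + 0.5)]
best_index, all_scores = best_hypothesis(model, scene, candidates)
print(best_index, all_scores)
```

Because the score depends only on pooled statistics of the point differences, nothing in it is tied to a particular object model, which is the property a learned scoring function would need in order to generalize to objects not seen during training.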
