A Multi-Hypothesis Approach to Pose Ambiguity in Object-Based SLAM

In object-based Simultaneous Localization and Mapping (SLAM), 6D object poses offer a compact representation of landmark geometry useful for downstream planning and manipulation tasks. However, measurement ambiguity then arises as objects may possess complete or partial object shape symmetries (e.g., due to occlusion), making it difficult or impossible to generate a single consistent object pose estimate. One idea is to generate multiple pose candidates to counteract measurement ambiguity. In this paper, we develop a novel approach that enables an object-based SLAM system to reason about multiple pose hypotheses for an object, and synthesize this locally ambiguous information into a globally consistent robot and landmark pose estimation formulation. In particular, we (1) present a learned pose estimation network that provides multiple hypotheses about the 6D pose of an object; (2) by treating the output of our network as components of a mixture model, we incorporate pose predictions into a SLAM system, which, over successive observations, recovers a globally consistent set of robot and object (landmark) pose estimates. We evaluate our approach on the popular YCB-Video Dataset and a simulated video featuring YCB objects. Experiments demonstrate that our approach is effective in improving the robustness of object-based SLAM in the face of object pose ambiguity. 1

[1]  Sebastian Thrun,et al.  Simultaneous localization and mapping with unknown data association using FastSLAM , 2003, 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422).

[2]  Michael Milford,et al.  QuadricSLAM: Dual Quadrics From Object Detections as Landmarks in Object-Oriented SLAM , 2018, IEEE Robotics and Automation Letters.

[3]  Paul Timothy Furgale,et al.  Associating Uncertainty With Three-Dimensional Poses for Use in Estimation Problems , 2014, IEEE Transactions on Robotics.

[4]  Edwin Olson,et al.  Inference on networks of mixtures for robust robot mapping , 2013, Int. J. Robotics Res..

[5]  Siddhartha S. Srinivasa,et al.  The YCB object and Model set: Towards common benchmarks for manipulation research , 2015, 2015 International Conference on Advanced Robotics (ICAR).

[6]  Siddhartha S. Srinivasa,et al.  The MOPED framework: Object recognition and pose estimation for manipulation , 2011, Int. J. Robotics Res..

[7]  Paul H. J. Kelly,et al.  SLAM++: Simultaneous Localisation and Mapping at the Level of Objects , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Silvio Savarese,et al.  DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Dieter Fox,et al.  Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects , 2018, CoRL.

[10]  John J. Leonard,et al.  Advances in Inference and Representation for Simultaneous Localization and Mapping , 2021, Annu. Rev. Control. Robotics Auton. Syst..

[11]  Bolei Zhou,et al.  SegICP: Integrated deep semantic segmentation and pose estimation , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[12]  Shichao Yang,et al.  CubeSLAM: Monocular 3-D Object SLAM , 2018, IEEE Transactions on Robotics.

[13]  Hongdong Li,et al.  Rotation Averaging , 2013, International Journal of Computer Vision.

[14]  F. Dellaert Factor Graphs and GTSAM: A Hands-on Introduction , 2012 .

[15]  Roland Siegwart,et al.  Collaborative 3D Reconstruction Using Heterogeneous UAVs: System and Experiments , 2016, ISER.

[16]  Dieter Fox,et al.  PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes , 2017, Robotics: Science and Systems.

[17]  Paul J. Besl,et al.  A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[18]  Frank Dellaert,et al.  iSAM2: Incremental smoothing and mapping using the Bayes tree , 2012, Int. J. Robotics Res..

[19]  Kostas Daniilidis,et al.  Single image 3D object detection and pose estimation for grasping , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[20]  Morgan Quigley,et al.  ROS: an open-source Robot Operating System , 2009, ICRA 2009.

[21]  Vincent Lepetit,et al.  Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes , 2012, ACCV.

[22]  Yue Wang,et al.  PRNet: Self-Supervised Learning for Partial-to-Partial Registration , 2019, NeurIPS.

[23]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[24]  Pushmeet Kohli,et al.  Multiple Choice Learning: Learning to Produce Multiple Structured Outputs , 2012, NIPS.

[25]  Yue Wang,et al.  Deep Closest Point: Learning Representations for Point Cloud Registration , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[26]  Ashish Kapoor,et al.  AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles , 2017, FSR.

[27]  Alexei A. Efros,et al.  Seeing 3D Chairs: Exemplar Part-Based 2D-3D Alignment Using a Large Dataset of CAD Models , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  D. Hähnel,et al.  Probabilistic Matching for 3D Scan Registration , 2002 .

[29]  Roland Siegwart,et al.  Point Clouds Registration with Probabilistic Data Association , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[30]  Niko Sünderhauf,et al.  Switchable constraints for robust pose graph SLAM , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[31]  David Baxter,et al.  Probabilistic Data Association via Mixture Models for Robust Semantic SLAM , 2019, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[32]  Timothy Bretl,et al.  PoseRBPF: A Rao–Blackwellized Particle Filter for 6-D Object Pose Tracking , 2019, IEEE Transactions on Robotics.

[33]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[34]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Nassir Navab,et al.  SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[36]  Niko Sünderhauf,et al.  Towards a robust back-end for pose graph SLAM , 2012, 2012 IEEE International Conference on Robotics and Automation.