Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications

Dense Object Nets (DONs), introduced by Florence, Manuelli and Tedrake (2018), brought dense object descriptors to the robotics community as a novel visual object representation. The representation is suitable for many applications, including object grasping and policy learning. DONs map an RGB image depicting an object into a descriptor space image, which implicitly encodes key features of the object invariant to the relative camera pose. Impressively, the self-supervised training of DONs can be applied to arbitrary objects, and the resulting networks can be evaluated and deployed within hours. However, this training approach relies on accurate depth images and faces challenges with the small, reflective objects typical of industrial settings when consumer-grade depth cameras are used. In this paper, we show that, given a 3D model of an object, we can generate its descriptor space image, which allows for supervised training of DONs. We rely on Laplacian Eigenmaps (LE) to embed the 3D model of an object into an optimally generated space. While our approach uses more domain knowledge, it can be applied efficiently even to small, reflective objects, as it does not rely on depth information. We compare the training methods on generating 6D grasps for industrial objects and show that our novel supervised training approach improves pick-and-place performance in industry-relevant tasks.
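
To make the embedding step concrete, the following is a minimal sketch of Laplacian Eigenmaps applied to the vertex graph of a triangle mesh. It is an illustration under stated assumptions, not the implementation used in the paper: the function name `laplacian_eigenmap`, the unit edge weights, the dense eigensolver, and the octahedron test mesh are all hypothetical choices made here for brevity. Each vertex receives a low-dimensional descriptor taken from the eigenvectors of the generalized problem L v = λ D v with the smallest non-zero eigenvalues.

```python
import numpy as np


def laplacian_eigenmap(num_vertices, faces, dim=3):
    """Embed mesh vertices into a `dim`-dimensional descriptor space.

    Assumption: unit edge weights on the mesh connectivity graph. Other
    weightings (e.g. geometry-aware ones) would change the embedding.
    """
    # Symmetric adjacency matrix built from the triangle edges.
    W = np.zeros((num_vertices, num_vertices))
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            W[i, j] = W[j, i] = 1.0

    deg = W.sum(axis=1)          # vertex degrees (diagonal of D)
    L = np.diag(deg) - W         # graph Laplacian L = D - W

    # Solve L v = lambda D v through the symmetric normalized Laplacian
    # D^{-1/2} L D^{-1/2} u = lambda u, with v = D^{-1/2} u.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order

    # Skip the trivial eigenvector (eigenvalue 0) and keep the next `dim`.
    descriptors = d_inv_sqrt[:, None] * eigvecs[:, 1:dim + 1]
    return descriptors, eigvals[1:dim + 1]


# Hypothetical test mesh: an octahedron with antipodal vertex pairs
# (0, 1), (2, 3) and (4, 5).
faces = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
         (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]
descriptors, eigvals = laplacian_eigenmap(6, faces, dim=3)
print(descriptors.shape)  # one 3D descriptor per vertex
```

In a supervised setup of the kind described above, such per-vertex descriptors could be rendered into the image plane for a known object pose to obtain a target descriptor space image; that rendering step is not shown here.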
