Segmentation and Recovery of Superquadric Models using Convolutional Neural Networks

In this paper we address the problem of representing 3D visual data with parameterized volumetric shape primitives. Specifically, we present a two-stage approach built around convolutional neural networks (CNNs) that segments complex depth scenes into simpler geometric structures that can be represented with superquadric models. In the first stage, our approach uses a Mask R-CNN model to identify superquadric-like structures in depth scenes; in the second stage, it fits superquadric models to the segmented structures using a specially designed CNN regressor. Using our approach, we are able to describe complex structures with a small number of interpretable parameters. We evaluate the proposed approach on synthetic as well as real-world depth data and show that our solution not only achieves competitive performance in comparison to the state of the art, but also decomposes scenes into a number of superquadric models in a fraction of the time required by competing approaches. We make all data and models used in the paper available from this https URL.
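To make the "small number of interpretable parameters" concrete, the sketch below evaluates the standard inside-outside function of a superquadric as it is commonly written in the superquadric literature. The abstract does not state which parameterization the paper uses, so this is an assumption: three size parameters (a1, a2, a3) and two shape exponents (e1, e2), with the primitive centred at the origin and axis-aligned. Position and rotation, which a full model would also carry, are left out. The function name and argument layout are illustrative only.

```python
def superquadric_inside_outside(point, size, shape):
    """Evaluate the inside-outside function F of a superquadric.

    Assumes an origin-centred, axis-aligned superquadric (no pose).
    F < 1 means the point is inside, F == 1 on the surface, F > 1 outside.

    point: (x, y, z) coordinates
    size:  (a1, a2, a3) extents along the x, y and z axes, all > 0
    shape: (e1, e2) shape exponents, both > 0
    """
    # The surface is symmetric about every axis, so absolute values
    # keep the fractional powers real-valued.
    x, y, z = (abs(c) for c in point)
    a1, a2, a3 = size
    e1, e2 = shape

    xy_term = (x / a1) ** (2.0 / e2) + (y / a2) ** (2.0 / e2)
    return xy_term ** (e2 / e1) + (z / a3) ** (2.0 / e1)


# Same point, same size, different shape exponents.
corner = (0.9, 0.9, 0.9)
unit = (1.0, 1.0, 1.0)
f_sphere = superquadric_inside_outside(corner, unit, (1.0, 1.0))  # sphere-like
f_box = superquadric_inside_outside(corner, unit, (0.1, 0.1))     # box-like
print(f_sphere > 1.0, f_box < 1.0)
```

With both exponents equal to 1 the primitive is an ellipsoid, and the corner point lies outside it. With exponents near 0 the primitive approaches a box, and the same point lies inside. This is why a few parameters can describe visibly different shapes.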
