Paper information - Learning Unsupervised Hierarchical Part Decomposition of 3D Objects From a Single RGB Image

Humans perceive the 3D world as a set of distinct objects that are characterized by various low-level (geometry, reflectance) and high-level (connectivity, adjacency, symmetry) properties. Recent methods based on convolutional neural networks (CNNs) have demonstrated impressive progress in 3D reconstruction, even when using a single 2D image as input. However, the majority of these methods focus on recovering the local 3D geometry of an object without considering its part-based decomposition or the relations between parts. We address this challenging problem by proposing a novel formulation that jointly recovers the geometry of a 3D object as a set of primitives, together with their latent hierarchical structure, without part-level supervision. Our model recovers the higher-level structural decomposition of various objects in the form of a binary tree of primitives, where simple parts are represented with fewer primitives and more complex parts are modeled with more components. Our experiments on the ShapeNet and D-FAUST datasets demonstrate that considering the organization of parts indeed facilitates reasoning about 3D geometry.
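
To make the "binary tree of primitives" idea concrete, the following is a minimal illustrative sketch of such a data structure, not the authors' implementation. The class `PrimitiveNode`, the helpers `leaves` and `depth`, and the chair part labels are all hypothetical; the method described above is unsupervised and does not produce named parts. The sketch only shows how a simple part can terminate early as a single leaf while a more complex part is split further.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PrimitiveNode:
    """One node of a binary part tree (hypothetical structure).

    `label` stands in for a fitted primitive; a real model would store
    shape and pose parameters here instead of a string.
    """
    label: str
    left: Optional["PrimitiveNode"] = None
    right: Optional["PrimitiveNode"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def leaves(node: PrimitiveNode) -> List[str]:
    """Return the leaf primitives, i.e. the finest-level decomposition."""
    if node.is_leaf():
        return [node.label]
    found: List[str] = []
    for child in (node.left, node.right):
        if child is not None:
            found.extend(leaves(child))
    return found


def depth(node: PrimitiveNode) -> int:
    """Number of edges on the longest root-to-leaf path."""
    if node.is_leaf():
        return 0
    children = [c for c in (node.left, node.right) if c is not None]
    return 1 + max(depth(c) for c in children)


# Hypothetical chair: the seat is a simple part and stays a single leaf,
# while the frame is more complex and is split over two further levels.
tree = PrimitiveNode(
    "chair",
    left=PrimitiveNode("seat"),
    right=PrimitiveNode(
        "frame",
        left=PrimitiveNode("back"),
        right=PrimitiveNode(
            "legs",
            left=PrimitiveNode("front legs"),
            right=PrimitiveNode("rear legs"),
        ),
    ),
)
```

In this toy tree the seat is represented by one primitive at depth 1, whereas the legs occupy two primitives at depth 3, mirroring the claim that complex parts are modeled with more components.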
