Annotating 3D Models and Their Parts via Deep Feature Embedding

The need to organize 3D shape data has prompted studies on the comparison and retrieval of 3D shape models. Being able to query 3D shape models by words, in addition to 3D model examples and 2D sketches, would be highly beneficial. This paper proposes a method that associates whole 3D models (e.g., automobile) as well as their parts (e.g., tire, body, engine) with word labels so that the 3D models can be queried by words. The associations between 3D shapes and words are learned from a dataset of 3D models in which both the whole models and their segmented parts are labeled with words. A Word Shape embedding Network (WSN) embeds the feature vectors of these words (distributed representations) and the feature vectors of whole and partial 3D model geometries into a common feature embedding space. Because the word feature vectors are learned by Word2Vec trained on a Wikipedia corpus, the common embedding space can be queried with a wide variety of words that are not included in the labeled 3D model dataset. Experimental evaluation shows that, with the proposed algorithm, 3D shapes can be queried by labels of either whole or part shapes, or by labels that are semantically close to but not included in the original 3D model dataset.
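The idea of querying shapes by words in a common embedding space can be illustrated with a minimal sketch. This is not the WSN itself: the 3-dimensional vectors below are hand-picked, hypothetical stand-ins for embeddings that a trained network would produce, and the names `SHAPE_EMBEDDINGS`, `WORD_EMBEDDINGS`, `rank_shapes`, and `query` are assumptions made for illustration. Cosine similarity is likewise an assumed choice of ranking measure; the abstract does not state which one is used. The sketch only shows how whole shapes and parts might be ranked against a query word once both live in the same space, including a word ("wheel") that is assumed not to appear among the shape labels.

```python
import math

# Hypothetical embeddings of whole shapes and parts in the common space.
# A trained embedding network would produce these; here they are toy values.
SHAPE_EMBEDDINGS = {
    "car_whole": [0.9, 0.1, 0.0],
    "car_tire": [0.6, 0.7, 0.1],
    "chair_whole": [0.0, 0.2, 0.9],
}

# Hypothetical embeddings of query words in the same space.
# "wheel" is assumed NOT to be a label in the shape dataset.
WORD_EMBEDDINGS = {
    "automobile": [1.0, 0.0, 0.0],
    "wheel": [0.5, 0.8, 0.0],
    "seat": [0.1, 0.1, 1.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_shapes(query_vec, shape_embeddings):
    """Return (shape_name, similarity) pairs, most similar first."""
    scored = [(name, cosine(query_vec, vec))
              for name, vec in shape_embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def query(word):
    """Rank all shapes and parts against a single query word."""
    return rank_shapes(WORD_EMBEDDINGS[word], SHAPE_EMBEDDINGS)


if __name__ == "__main__":
    for word in ("automobile", "wheel", "seat"):
        best_name, best_score = query(word)[0]
        print(f"{word!r} -> {best_name} (similarity {best_score:.3f})")
```

With these toy values, the word "wheel" ranks the tire part first even though it is not one of the shape labels, which mirrors the kind of semantically close query the abstract describes.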
