Paper information - Shape2Vec: semantic-based descriptors for 3D shapes, sketches and images


Convolutional neural networks have been successfully used to compute shape descriptors, or to jointly embed shapes and sketches in a common vector space. We propose a novel approach that leverages both labeled 3D shapes and the semantic information contained in their labels to generate semantically meaningful shape descriptors. A neural network is trained to generate shape descriptors that lie close to a vector representation of the shape class, given a vector space of words. This method extends easily to range scans, hand-drawn sketches and images, which makes cross-modal retrieval possible without designing a different method for each query type. We show that sketch-based shape retrieval using semantic-based descriptors outperforms the state of the art by large margins, and that mesh-based retrieval returns results of higher relevance to the query than current deep shape descriptors.
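The core idea in the abstract, training descriptors to lie close to the word vector of the shape class, can be illustrated with a toy sketch. This is not the authors' network: the linear map, the 3-D vectors in `WORD_VECS`, the 4-D `FEATURES`, and every function name below are assumptions chosen for illustration. A real system would presumably use a deep network and pretrained word embeddings.

```python
import math
import random

# Hypothetical toy word vectors; stand-ins for pretrained word embeddings.
WORD_VECS = {
    "chair": [1.0, 0.0, 0.0],
    "table": [0.0, 1.0, 0.0],
    "plane": [0.0, 0.0, 1.0],
}

# Hypothetical raw shape features (e.g. what an upstream feature extractor might emit).
FEATURES = [
    [1.0, 0.1, 0.0, 0.2], [0.9, 0.0, 0.1, 0.1],   # chairs
    [0.1, 1.0, 0.0, 0.2], [0.0, 0.9, 0.1, 0.1],   # tables
    [0.1, 0.0, 1.0, 0.2], [0.0, 0.1, 0.9, 0.1],   # planes
]
LABELS = ["chair", "chair", "table", "table", "plane", "plane"]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def squared_distance(a, b):
    """Squared Euclidean distance, used here as the embedding loss."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(ai * ai for ai in a))
    nb = math.sqrt(sum(bi * bi for bi in b))
    return sum(ai * bi for ai, bi in zip(a, b)) / (na * nb) if na and nb else 0.0


def train_linear_embedding(features, labels, word_vecs, lr=0.05, epochs=200, seed=0):
    """Fit W by SGD so that W @ x lies close to the word vector of x's class."""
    rng = random.Random(seed)
    d_in = len(features[0])
    d_out = len(next(iter(word_vecs.values())))
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(d_out)]
    for _ in range(epochs):
        for x, label in zip(features, labels):
            target = word_vecs[label]
            pred = matvec(W, x)
            for i in range(d_out):
                err = pred[i] - target[i]
                for j in range(d_in):
                    # Gradient of (pred_i - target_i)^2 with respect to W[i][j].
                    W[i][j] -= lr * 2.0 * err * x[j]
    return W


def nearest_label(descriptor, word_vecs):
    """Return the class whose word vector is most similar to the descriptor."""
    return max(word_vecs, key=lambda name: cosine(descriptor, word_vecs[name]))


W = train_linear_embedding(FEATURES, LABELS, WORD_VECS)
query = [0.95, 0.05, 0.05, 0.15]          # an unseen, chair-like feature vector
predicted = nearest_label(matvec(W, query), WORD_VECS)
```

Because every descriptor lives in the word-vector space, a second map trained on sketch or image features against the same `WORD_VECS` would produce descriptors directly comparable with these, which is the property the abstract relies on for cross-modal retrieval.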

Neil A. Dodgson | Flora Ponjou Tasse
