Deep Aggregation of Local 3D Geometric Features for 3D Model Retrieval

Aggregation of local features is a well-studied approach for image as well as 3D model retrieval (3DMR). A carefully designed local 3D geometric feature is able to describe detailed local geometry of 3D model, often with invariance to geometric transformations that include 3D rotation of local 3D regions. For efficient 3DMR, these local features are aggregated into a feature per 3D model. A recent alternative, end-toend 3D Deep Convolutional Neural Network (3D-DCNN) [7][33], has achieved accuracy superior to the abovementioned aggregation-of-local-features approach. However, current 3D-DCNN based methods have weaknesses; they lack invariance against 3D rotation, and they often miss detailed geometrical features due to their quantization of shapes into coarse voxels in applying 3D-DCNN. In this paper, we propose a novel deep neural network for 3DMR called Deep Local feature Aggregation Network (DLAN) that combines extraction of rotation-invariant 3D local features and their aggregation in a single deep architecture. The DLAN describes local 3D regions of a 3D model by using a set of 3D geometric features invariant to local rotation. The DLAN then aggregates the set of features into a (global) rotation-invariant and compact feature per 3D model. Experimental evaluation shows that the DLAN outperforms the existing deep learning-based 3DMR algorithms.

[1]  Federico Tombari,et al.  Unique Signatures of Histograms for Local Surface Description , 2010, ECCV.

[2]  Ryutarou Ohbuchi,et al.  Fusing Multiple Features for Shape-based 3D Model Retrieval , 2014, BMVC.

[3]  Bin Fang,et al.  A comparison of 3D shape retrieval methods based on a large-scale benchmark supporting multimodal queries , 2015, Comput. Vis. Image Underst..

[4]  J. Orbach Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms. , 1962 .

[5]  Daniel Cohen-Or,et al.  Contextual Part Analogies in 3D Objects , 2010, International Journal of Computer Vision.

[6]  Ryutarou Ohbuchi,et al.  Non-rigid 3D Model Retrieval Using Set of Local Statistical Features , 2012, 2012 IEEE International Conference on Multimedia and Expo Workshops.

[7]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[8]  Pierre Vandergheynst,et al.  Geodesic Convolutional Neural Networks on Riemannian Manifolds , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).

[9]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[10]  Florent Perronnin,et al.  Fisher vectors meet Neural Networks: A hybrid classification architecture , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Zhichao Zhou,et al.  DeepPano: Deep Panoramic Representation for 3-D Shape Recognition , 2015, IEEE Signal Processing Letters.

[12]  Victor S. Lempitsky,et al.  Aggregating Local Deep Features for Image Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[13]  Subhransu Maji,et al.  Multi-view Convolutional Neural Networks for 3D Shape Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[14]  Junwei Han,et al.  Local deep feature learning framework for 3D shape , 2015, Comput. Graph..

[15]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[16]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[17]  Yi Liu,et al.  Shape Topics: A Compact Representation and New Algorithms for 3D Partial Shape Retrieval , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[18]  Sebastian Scherer,et al.  VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[19]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[20]  Ryutarou Ohbuchi,et al.  Diffusion-on-Manifold Aggregation of Local Features for Shape-based 3D Model Retrieval , 2015, ICMR.

[21]  Terrence J. Sejnowski,et al.  Edges are the Independent Components of Natural Scenes , 1996, NIPS.

[22]  Ryutarou Ohbuchi,et al.  Salient local visual features for shape-based 3D model retrieval , 2008, 2008 IEEE International Conference on Shape Modeling and Applications.

[23]  Aaron Hertzmann,et al.  Learning 3D mesh segmentation and labeling , 2010, SIGGRAPH 2010.

[24]  Bo Li,et al.  Large-Scale 3D Shape Retrieval from ShapeNet Core55 , 2016, 3DOR@Eurographics.

[25]  A. Khosla,et al.  A Deep Representation for Volumetric Shape Modeling , 2015 .

[26]  Andrea Giachetti,et al.  SHREC ’ 15 Track : Non-rigid 3 D Shape Retrieval † , 2016 .

[27]  Thomas Mensink,et al.  Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.

[28]  Leonidas J. Guibas,et al.  A concise and provably informative multi-scale signature based on heat diffusion , 2009 .

[29]  Svetlana Lazebnik,et al.  Multi-scale Orderless Pooling of Deep Convolutional Activation Features , 2014, ECCV.

[30]  Bernard Chazelle,et al.  Shape distributions , 2002, TOGS.

[31]  Thomas S. Huang,et al.  Image Classification Using Super-Vector Coding of Local Image Descriptors , 2010, ECCV.

[32]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[33]  Longin Jan Latecki,et al.  GIFT: A Real-Time and Scalable 3D Shape Search Engine , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Ming Ouhyoung,et al.  On Visual Similarity Based 3D Model Retrieval , 2003, Comput. Graph. Forum.