Diffusion-on-Manifold Aggregation of Local Features for Shape-based 3D Model Retrieval

Aggregating a set of local features has become one of the most common approaches to representing multimedia data such as 2D images and 3D models. The success of Bag-of-Features (BF) aggregation [2] prompted several extensions to BF, namely VLAD [12], Fisher Vector (FV) coding [22], and Super Vector (SV) coding [34]. They all learn a small number of codewords, or representative local features, by clustering a large set of local features. The set of local features extracted from a media item (e.g., an image) is then encoded by considering the distribution of features around the codewords: BF uses frequency, VLAD and FV use displacement vectors, and SV uses a combination of both. In doing so, these encoding algorithms assume that the feature space is linear around each codeword. Consequently, even if the set of features forms a non-linear manifold, its non-linearity is ignored, potentially degrading the quality of the aggregated features. In this paper, we propose a novel feature aggregation algorithm called Diffusion-on-Manifold (DM) that tries to take into account, via diffusion distance, the structure of the non-linear manifold formed by the set of local features. With 3D shape retrieval in view, we also propose a local 3D shape feature defined for oriented point sets. Experiments in a shape-based 3D model retrieval scenario show that DM aggregation yields better retrieval accuracy than the existing aggregation algorithms we compared against, including VLAD, FV, and SV.
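To make the contrast between frequency-based and displacement-based encoding concrete, the following is a minimal sketch of BF and VLAD aggregation over a fixed codebook. It is not the paper's implementation: the function names, the hard nearest-codeword assignment, the L2 normalization, and the toy codebook and features are all assumptions chosen for illustration. Note that the VLAD residual is a straight-line Euclidean displacement from the codeword, which is the linearity assumption discussed above.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_codeword(feature, codebook):
    """Index of the codeword closest to the feature (hard assignment, an assumption)."""
    return min(range(len(codebook)), key=lambda k: sq_dist(feature, codebook[k]))

def encode_bf(features, codebook):
    """Bag-of-Features: normalized histogram of codeword frequencies."""
    hist = [0.0] * len(codebook)
    for f in features:
        hist[nearest_codeword(f, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def encode_vlad(features, codebook):
    """VLAD: per-codeword sum of displacement vectors, flattened and L2-normalized."""
    dim = len(codebook[0])
    residuals = [[0.0] * dim for _ in codebook]
    for f in features:
        k = nearest_codeword(f, codebook)
        for d in range(dim):
            # Linear displacement of the feature from its codeword.
            residuals[k][d] += f[d] - codebook[k][d]
    flat = [v for r in residuals for v in r]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm else flat

# Hypothetical toy data: two 2D codewords and three local features.
codebook = [[0.0, 0.0], [10.0, 10.0]]
features = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]

bf = encode_bf(features, codebook)      # one bin per codeword
vlad = encode_vlad(features, codebook)  # len(codebook) * dim values
```

In this sketch the BF vector has one entry per codeword, while the VLAD vector has one entry per codeword per feature dimension, so VLAD keeps directional information about where the features lie relative to each codeword.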

[1] Ahmed M. Elgammal, et al. Putting local features on a manifold, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2] David Picard, et al. Improving image similarity with vectors of locally aggregated tensors, 2011, 2011 18th IEEE International Conference on Image Processing.

[3] Ryutarou Ohbuchi, et al. Fusing Multiple Features for Shape-based 3D Model Retrieval, 2014, BMVC.

[4] Yihong Gong, et al. Linear spatial pyramid matching using sparse coding for image classification, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Hao Zhang, et al. Robust 3D Shape Correspondence in the Spectral Domain, 2006, IEEE International Conference on Shape Modeling and Applications 2006 (SMI'06).

[6] Gabriela Csurka, et al. Visual categorization with bags of keypoints, 2002, ECCV 2004.

[7] C. Schmid, et al. On the burstiness of visual elements, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[8] Yi Liu, et al. Shape Topics: A Compact Representation and New Algorithms for 3D Partial Shape Retrieval, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9] Ryutarou Ohbuchi, et al. Non-rigid 3D Model Retrieval Using Set of Local Statistical Features, 2012, 2012 IEEE International Conference on Multimedia and Expo Workshops.

[10] Cordelia Schmid, et al. Aggregating local descriptors into a compact image representation, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[11] Federico Tombari, et al. Unique Signatures of Histograms for Local Surface Description, 2010, ECCV.

[12] Thomas S. Huang, et al. Image Classification Using Super-Vector Coding of Local Image Descriptors, 2010, ECCV.

[13] Pingkun Yan, et al. SIFT on manifold: An intrinsic description, 2013, Neurocomputing.

[14] David G. Lowe, et al. Distinctive Image Features from Scale-Invariant Keypoints, 2004.

[15] Qi Tian, et al. Lp-Norm IDF for Large Scale Image Search, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[16] Bin Fang, et al. Large Scale Comprehensive 3D Shape Retrieval, 2014, 3DOR@Eurographics.

[17] David G. Lowe, et al. Distinctive Image Features from Scale-Invariant Keypoints, 2004, International Journal of Computer Vision.

[18] Yue Gao, et al. View-Based 3D Object Retrieval: Challenges and Approaches, 2014, IEEE MultiMedia.

[19] Yihong Gong, et al. Nonlinear Learning using Local Coordinate Coding, 2009, NIPS.

[20] Mohammed Bennamoun, et al. Rotational Projection Statistics for 3D Local Surface Description and Object Recognition, 2013, International Journal of Computer Vision.

[21] Ryutarou Ohbuchi, et al. Dense sampling and fast encoding for 3D model retrieval using bag-of-visual features, 2009, CIVR '09.

[22] Bernhard Schölkopf, et al. Ranking on Data Manifolds, 2003, NIPS.

[23] Horst Bischof, et al. Diffusion Processes for Retrieval Revisited, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[24] Paul Suetens, et al. SHREC '11 Track: Shape Retrieval on Non-rigid 3D Watertight Meshes, 2011, 3DOR@Eurographics.

[25] Ligang Liu, et al. Mesh saliency via ranking unsalient patches in a descriptor space, 2015, Comput. Graph.

[26] Thomas Mensink, et al. Improving the Fisher Kernel for Large-Scale Image Classification, 2010, ECCV.

[27] Andrew Zisserman, et al. The devil is in the details: an evaluation of recent feature encoding methods, 2011, BMVC.

[28] Ryutarou Ohbuchi, et al. Scale-weighted dense bag of visual features for 3D model retrieval from a partial view 3D model, 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[29] Pierre Geurts, et al. Extremely randomized trees, 2006, Machine Learning.

[30] Karthik Ramani, et al. Developing an engineering shape benchmark for CAD models, 2006, Comput. Aided Des.

[31] Yihong Gong, et al. Locality-constrained Linear Coding for image classification, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[32] Lei Wang, et al. In defense of soft-assignment coding, 2011, 2011 International Conference on Computer Vision.

[33] Nico Blodow, et al. Fast Point Feature Histograms (FPFH) for 3D registration, 2009, 2009 IEEE International Conference on Robotics and Automation.

[34] Thomas A. Funkhouser, et al. The Princeton Shape Benchmark, 2004, Proceedings Shape Modeling Applications, 2004.

[35] Andrew E. Johnson, et al. Using Spin Images for Efficient Object Recognition in Cluttered 3D Scenes, 1999, IEEE Trans. Pattern Anal. Mach. Intell.

[36] Bernard Chazelle, et al. Shape distributions, 2002, TOGS.