Distance metric learning and feature combination for shape-based 3D model retrieval

This paper proposes a 3D model retrieval algorithm that combines unsupervised distance metric learning with a set of appearance-based features: two sets of local visual features and a set of global features. These visual features are extracted from range images of the 3D model rendered from multiple viewpoints. Each local visual feature is a bag-of-features histogram of Scale Invariant Feature Transform (SIFT) features by Lowe [7], sampled either at salient points or at dense, random points. The global visual feature is also a SIFT feature, sampled at the image center. The proposed method then uses unsupervised distance metric learning based on Manifold Ranking (MR) [15] to compute distances between these features. However, the original MR may not be effective when applied to a set of features whose distances follow certain distributions. We propose an empirical method that adjusts the distance profile so that MR becomes effective. Experiments showed that the retrieval algorithm, which uses a linear combination of the distances computed from the proposed features with the modified MR, performed well across multiple benchmarks with different characteristics.
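To make the ranking step concrete, the following is a minimal sketch of the standard closed-form Manifold Ranking formulation applied to a linear combination of distance matrices. It is not the authors' implementation: the function names, the Gaussian kernel width `sigma`, the damping factor `alpha`, the mean-based rescaling, and the equal weights are all assumptions made for illustration, and the paper's empirical adjustment of the distance profile is not described in the abstract and is therefore not reproduced here. The toy data stands in for per-feature distances between 3D models.

```python
import numpy as np

def manifold_ranking(dist, query_idx, sigma=1.0, alpha=0.9):
    """Closed-form Manifold Ranking scores for one query (higher = more relevant).

    sigma and alpha are illustrative defaults, not values from the paper.
    """
    n = dist.shape[0]
    # Gaussian affinity computed from distances, with self-loops removed.
    w = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Initial ranking vector: 1 at the query, 0 elsewhere.
    y = np.zeros(n)
    y[query_idx] = 1.0
    # Solve (I - alpha * S) f = y, the fixed point of the diffusion up to scale.
    return np.linalg.solve(np.eye(n) - alpha * s, y)

def combine_distances(dists, weights):
    """Weighted sum of per-feature distance matrices.

    Each matrix is rescaled to mean 1 first (an assumed normalisation).
    """
    total = np.zeros_like(dists[0], dtype=float)
    for d, wgt in zip(dists, weights):
        total += wgt * (d / d.mean())
    return total

# Toy data: two well-separated clusters of three points each.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
diff = pts[:, None, :] - pts[None, :, :]
d_l2 = np.sqrt((diff ** 2).sum(axis=-1))  # stand-in for one feature's distances
d_l1 = np.abs(diff).sum(axis=-1)          # stand-in for a second feature's distances

combined = combine_distances([d_l2, d_l1], weights=[0.5, 0.5])
scores = manifold_ranking(combined, query_idx=0, sigma=0.5)
ranking = np.argsort(-scores)  # indices ordered from most to least relevant
```

In this sketch the query's own cluster receives higher scores than the distant cluster, because relevance diffuses mainly along strong affinities. In practice the weights and `sigma` would need tuning per feature and per benchmark.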

[1] Karthik Ramani, et al. Three-dimensional shape searching: state-of-the-art review and future trends, 2005, Comput. Aided Des.

[2] Andrew Zisserman, et al. Video Google: a text retrieval approach to object matching in videos, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[3] Bernhard Schölkopf, et al. Learning with Local and Global Consistency, 2003, NIPS.

[4] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints, 2011.

[5] Pierre Geurts, et al. Extremely randomized trees, 2006, Machine Learning.

[6] Ryutarou Ohbuchi, et al. Salient local visual features for shape-based 3D model retrieval, 2008, 2008 IEEE International Conference on Shape Modeling and Applications.

[7] Gabriela Csurka, et al. Visual categorization with bags of keypoints, 2002, ECCV 2004.

[8] Eric Wahl, et al. Surflet-pair-relation histograms: a statistical 3D-shape representation for rapid classification, 2003, Fourth International Conference on 3-D Digital Imaging and Modeling, 2003. 3DIM 2003. Proceedings.

[9] Ming Ouhyoung, et al. On Visual Similarity Based 3D Model Retrieval, 2003, Comput. Graph. Forum.

[10] Antonio Criminisi, et al. Object categorization by learned universal visual dictionary, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[11] Thomas A. Funkhouser, et al. The Princeton Shape Benchmark, 2004, Proceedings Shape Modeling Applications, 2004.

[12] Remco C. Veltkamp, et al. A survey of content based 3D shape retrieval methods, 2004, Proceedings Shape Modeling Applications, 2004.

[13] Ryutarou Ohbuchi, et al. Dense sampling and fast encoding for 3D model retrieval using bag-of-visual features, 2009, CIVR '09.

[14] Szymon Rusinkiewicz, et al. Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors, 2003, Symposium on Geometry Processing.