Paper information - 3D shape retrieval using a single depth image from low-cost sensors

Content-based 3D shape retrieval is an important problem in computer vision. Traditional retrieval interfaces require a 2D sketch or a manually designed 3D model as the query, which is difficult to specify and thus impractical in real applications. With recent advances in low-cost 3D sensors such as Microsoft Kinect and Intel RealSense, capturing depth images that carry 3D information is fairly simple, making shape retrieval more practical and user-friendly. In this paper, we study the problem of cross-domain 3D shape retrieval, using a single depth image from a low-cost sensor as the query to search for similar human-designed CAD models. We propose a novel method that uses an ensemble of autoencoders, in which each autoencoder is trained to learn a compressed representation of depth views synthesized from one database object. By viewing each autoencoder as a probabilistic model, a likelihood score can be derived as a similarity measure. A domain adaptation layer is built on top of the autoencoder outputs to explicitly address the cross-domain issue (between noisy sensor data and clean 3D models) by incorporating training data of sensor depth images and their category labels in a weakly supervised learning formulation. Experiments using real-world depth images and a large-scale CAD dataset demonstrate the effectiveness of our approach, which offers significant improvements over state-of-the-art 3D shape retrieval methods.
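The retrieval idea in the abstract (one autoencoder per database object, with queries ranked by a per-model score) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several simplifying assumptions: a linear autoencoder fitted with PCA stands in for the trained autoencoders, negative squared reconstruction error stands in for the likelihood score that the paper derives, and the domain adaptation layer is omitted entirely. The names `LinearAutoencoder` and `rank_objects` are hypothetical.

```python
import numpy as np


class LinearAutoencoder:
    """Stand-in for one per-object autoencoder (assumption: PCA-based, linear)."""

    def __init__(self, n_hidden):
        self.n_hidden = n_hidden

    def fit(self, views):
        # views: (n_views, n_pixels) flattened depth views rendered from ONE object.
        self.mean = views.mean(axis=0)
        _, _, vt = np.linalg.svd(views - self.mean, full_matrices=False)
        # The top right-singular vectors act as tied encoder/decoder weights.
        self.components = vt[: self.n_hidden]
        return self

    def score(self, query):
        # Negative squared reconstruction error, used here only as a rough
        # proxy for a likelihood score (assumption, not the paper's formula).
        centered = query - self.mean
        code = self.components @ centered    # encode
        recon = self.components.T @ code     # decode
        return -float(np.sum((centered - recon) ** 2))


def rank_objects(models, query):
    """Return database indices sorted from best to worst match, plus raw scores."""
    scores = np.array([m.score(query) for m in models])
    return np.argsort(-scores), scores
```

To use it, fit one `LinearAutoencoder` on the synthesized depth views of each CAD model, flatten the query depth image to the same length, and call `rank_objects(models, query)`. Keeping one model per object means the database can grow without retraining the existing models, at the cost of scoring the query against every model.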
