Paper Information - RGB-D to CAD Retrieval with ObjectNN Dataset

RGB-D to CAD Retrieval with ObjectNN Dataset

The goal of this track is to study and evaluate the performance of 3D object retrieval algorithms using RGB-D data. This is inspired by the practical need to pair an object acquired with a consumer-grade depth camera to CAD models available in public datasets on the Internet. To support the study, we propose ObjectNN, a new dataset with well-segmented and annotated RGB-D objects from SceneNN [HPN∗16] and CAD models from ShapeNet [CFG∗15]. The evaluation results show that the RGB-D to CAD retrieval problem, while challenging to solve due to partial and noisy 3D reconstruction, can be addressed to a good extent using deep learning techniques, particularly convolutional neural networks trained on multi-view and 3D geometry. The best method in this track scores 82% in accuracy.

[1] Ioannis Pratikakis, et al. PANORAMA: A 3D Shape Descriptor Based on Panoramic Views for Unsupervised 3D Object Retrieval, 2010, International Journal of Computer Vision.

[2] Ralph R. Martin, et al. Partial Shape Queries for 3D Object Retrieval, 2016, 3DOR@Eurographics.

[3] James Philbin, et al. FaceNet: A unified embedding for face recognition and clustering, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Derek Hoiem, et al. Indoor Segmentation and Support Inference from RGBD Images, 2012, ECCV.

[5] Bo Li, et al. Sketch-Based 3D Model Retrieval by Viewpoint Entropy-Based Adaptive View Clustering, 2013, 3DOR@Eurographics.

[6] Bo Li, et al. Shape Retrieval of Low-Cost RGB-D Captures, 2016, 3DOR@Eurographics.

[7] Mohammed Bennamoun, et al. Rotational Projection Statistics for 3D Local Surface Description and Object Recognition, 2013, International Journal of Computer Vision.

[8] Huarui Yin, et al. Range Scans based 3D Shape Retrieval, 2015, 3DOR@Eurographics.

[9] Duy-Dinh Le, et al. A Combination of Spatial Pyramid and Inverted Index for Large-Scale Image Retrieval, 2015, Int. J. Multim. Data Eng. Manag.

[10] Shin'ichi Satoh, et al. Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval, 2013, 2013 IEEE International Conference on Computer Vision.

[11] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.

[12] Jitendra Malik, et al. Shape matching and object recognition using shape contexts, 2010, 2010 3rd International Conference on Computer Science and Information Technology.

[13] Bo Li, et al. Sketch-based 3D model retrieval utilizing adaptive view clustering and semantic information, 2016, Multimedia Tools and Applications.

[14] Leonidas J. Guibas, et al. ShapeNet: An Information-Rich 3D Model Repository, 2015, ArXiv.

[15] Michael Isard, et al. Lost in quantization: Improving particular object retrieval in large scale image databases, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[16] Luc Van Gool, et al. Speeded-Up Robust Features (SURF), 2008, Comput. Vis. Image Underst.

[17] Duc Thanh Nguyen, et al. A Robust 3D-2D Interactive Tool for Scene Segmentation and Annotation, 2016, IEEE Transactions on Visualization and Computer Graphics.

[18] Matthias Nießner, et al. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19] Subhransu Maji, et al. Multi-view Convolutional Neural Networks for 3D Shape Recognition, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[20] Duc Thanh Nguyen, et al. SceneNN: A Scene Meshes Dataset with aNNotations, 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[21] Sebastian Scherer, et al. VoxNet: A 3D Convolutional Neural Network for real-time object recognition, 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[22] Andrew Zisserman, et al. Video Google: a text retrieval approach to object matching in videos, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[23] Rita Cucchiara, et al. GOLD: Gaussians of Local Descriptors for image representation, 2015, Comput. Vis. Image Underst.