论文信息 - Semi-Supervised Multimodal Deep Learning for RGB-D Object Recognition

Semi-Supervised Multimodal Deep Learning for RGB-D Object Recognition

This paper studies the problem of RGB-D object recognition. Inspired by the great success of deep convolutional neural networks (DCNN) in AI, researchers have tried to apply it to improve the performance of RGB-D object recognition. However, DCNN always requires a large-scale annotated dataset to supervise its training. Manually labeling such a large RGB-D dataset is expensive and time consuming, which prevents DCNN from quickly promoting this research area. To address this problem, we propose a semi-supervised multimodal deep learning framework to train DCNN effectively based on very limited labeled data and massive unlabeled data. The core of our framework is a novel diversity preserving co-training algorithm, which can successfully guide DCNN to learn from the unlabeled RGB-D data by making full use of the complementary cues of the RGB and depth data in object representation. Experiments on the benchmark RGB-D dataset demonstrate that, with only 5% labeled training data, our approach achieves competitive performance for object recognition compared with those state-of-the-art results reported by fully-supervised methods.

[1] Xiaojin Zhu,et al. Semi-Supervised Learning Literature Survey , 2005 .

[2] Xin Zhao,et al. Query Adaptive Similarity Measure for RGB-D Object Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[3] Avrim Blum,et al. The Bottleneck , 2021, Monopsony Capitalism.

[4] Kaiqi Huang,et al. Convolutional Fisher Kernels for RGB-D Object Recognition , 2015, 2015 International Conference on 3D Vision.

[5] Dieter Fox,et al. Hierarchical Matching Pursuit for Image Classification: Architecture and Fast Algorithms , 2011, NIPS.

[6] Dieter Fox,et al. A large-scale hierarchical multi-view RGB-D object dataset , 2011, 2011 IEEE International Conference on Robotics and Automation.

[7] Dieter Fox,et al. Depth kernel descriptors for object recognition , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[8] Saturnino Maldonado-Bascón,et al. SURFing the point clouds: Selective 3D spatial pyramids for category-level object recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9] Maria-Florina Balcan,et al. Co-Training and Expansion: Towards Bridging Theory and Practice , 2004, NIPS.

[10] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[11] Jitendra Malik,et al. Learning Rich Features from RGB-D Images for Object Detection and Segmentation , 2014, ECCV.

[12] D. T. Lee,et al. Unsupervised Feature Learning for RGB-D Image Classification , 2014, ACCV.

[13] Kaiqi Huang,et al. Semi-supervised learning and feature evaluation for RGB-D object recognition , 2015, Comput. Vis. Image Underst..

[14] Polina Golland,et al. Convex Clustering with Exemplar-Based Models , 2007, NIPS.

[15] Jason Weston,et al. Deep learning via semi-supervised embedding , 2008, ICML '08.

[16] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[17] Dieter Fox,et al. Unsupervised Feature Learning for RGB-D Based Object Recognition , 2012, ISER.

[18] Martin A. Riedmiller,et al. A learned feature descriptor for object recognition in RGB-D data , 2012, 2012 IEEE International Conference on Robotics and Automation.

[19] J. van Leeuwen,et al. Neural Networks: Tricks of the Trade , 2002, Lecture Notes in Computer Science.

[20] Jiwen Lu,et al. MMSS: Multi-modal Sharable and Specific Feature Learning for RGB-D Object Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21] Wolfram Burgard,et al. Multimodal deep learning for robust RGB-D object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[22] Andrew Y. Ng,et al. Convolutional-Recursive Deep Learning for 3D Object Classification , 2012, NIPS.

[23] Sven Behnke,et al. RGB-D object recognition and pose estimation based on pre-trained convolutional neural network features , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[24] Tieniu Tan,et al. Semi-supervised Learning for RGB-D Object Recognition , 2014, 2014 22nd International Conference on Pattern Recognition.

[25] Dieter Fox,et al. Sparse distance learning for object recognition combining RGB and depth information , 2011, 2011 IEEE International Conference on Robotics and Automation.