RGB-D object recognition based on RGBD-PCANet learning

In this paper, we propose RGBD-PCANet, a simple deep learning method for RGB-D object recognition that extends the original PCANet to RGB-D images. First, the RGB and depth images are preprocessed to meet the requirements of the network's input layer. Second, features are extracted from the RGB-D images by a two-stage RGBD-PCANet, which consists of cascaded PCA, binary hashing, and block-wise histograms. Finally, an SVM is used as the classifier. We evaluate the proposed method on the widely used Washington RGB-D Object dataset. Extensive experiments show that RGBD-PCANet achieves performance comparable to state-of-the-art CNN-based methods, with low runtimes even without GPU acceleration.
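To make the pipeline concrete, the following is a minimal NumPy sketch of a generic two-stage PCANet feature extractor (cascaded PCA filters, binary hashing, block-wise histograms) on single-channel images. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 3x3 filter size, the filter counts, the non-overlapping 8x8 blocks, and the random toy data are all hypothetical, and the way RGBD-PCANet preprocesses and combines the RGB and depth channels is not specified in this abstract and is not modeled here.

```python
import numpy as np

def extract_patches(img, k):
    """All k x k patches of a zero-padded 2-D image, mean-removed, as columns."""
    padded = np.pad(img, k // 2)  # zero padding keeps one patch per pixel (odd k)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    patches = windows.reshape(-1, k * k).T          # shape (k*k, H*W)
    return patches - patches.mean(axis=0, keepdims=True)

def learn_pca_filters(images, k, n_filters):
    """PCA filters: leading eigenvectors of the patch scatter matrix."""
    X = np.hstack([extract_patches(im, k) for im in images])
    _, eigvecs = np.linalg.eigh(X @ X.T)            # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_filters]           # keep the largest ones
    return top.T.reshape(n_filters, k, k)

def apply_filters(img, filters):
    """Filter responses at every pixel, shape (n_filters, H, W)."""
    k = filters.shape[1]
    responses = filters.reshape(len(filters), -1) @ extract_patches(img, k)
    return responses.reshape(len(filters), *img.shape)

def hash_and_histogram(maps, block):
    """Binarize the maps, pack the bits into integer codes, histogram per block."""
    n_bits = maps.shape[0]
    weights = 2 ** np.arange(n_bits)[:, None, None]
    codes = ((maps > 0) * weights).sum(axis=0)      # values in [0, 2**n_bits - 1]
    H, W = codes.shape
    hists = [
        np.bincount(codes[r:r + block, c:c + block].ravel(), minlength=2 ** n_bits)
        for r in range(0, H - block + 1, block)     # non-overlapping blocks (assumption)
        for c in range(0, W - block + 1, block)
    ]
    return np.concatenate(hists)

def pcanet_features(img, filters1, filters2, block):
    """Two-stage feature vector for one single-channel image."""
    return np.concatenate([
        hash_and_histogram(apply_filters(stage1_map, filters2), block)
        for stage1_map in apply_filters(img, filters1)
    ])

# Toy data standing in for preprocessed single-channel training images.
rng = np.random.default_rng(0)
train = rng.standard_normal((6, 16, 16))

f1 = learn_pca_filters(train, k=3, n_filters=4)
stage1_maps = [m for im in train for m in apply_filters(im, f1)]
f2 = learn_pca_filters(stage1_maps, k=3, n_filters=3)

feat = pcanet_features(train[0], f1, f2, block=8)
# Feature length = 4 stage-1 filters * 4 blocks * 2**3 histogram bins = 128
print(feat.shape)
```

In a full pipeline, vectors like `feat` would be computed for every training image and passed to an SVM classifier. Because the filters come from an eigendecomposition rather than backpropagation, training involves no gradient descent.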
