Color object recognition via cross-domain learning on RGB-D images

This paper addresses object recognition from multiple-domain inputs. We present a novel approach that uses labeled RGB-D data in the training stage: depth features are extracted to enhance the discriminative capability of a learning system that otherwise relies only on RGB images. The highly dissimilar source and target domain data are mapped into a unified feature space through transfer at both the feature and classifier levels. To alleviate cross-domain discrepancy, we employ a state-of-the-art domain-adaptive dictionary learning algorithm that simultaneously updates the image representations in both domains and the classifier parameters. The proposed method is trained on the RGB-D Object dataset and evaluated on the Caltech-256 dataset. Experimental results suggest that our approach can lead to a significant performance gain over state-of-the-art methods.
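
To make the idea of updating both domains' representations and the classifier together more concrete, the following is a minimal sketch and not the algorithm used in the paper. It rests on several simplifying assumptions: source and target samples are paired column by column and share one code matrix `A`, the sparsity penalty is replaced by a ridge (squared L2) penalty so that every update has a closed form, and all names (`fit_joint_model`, `predict_from_target`, `alpha`, `lam`) are hypothetical.

```python
import numpy as np

def fit_joint_model(Xs, Xt, Y, n_atoms=8, alpha=1.0, lam=0.1, n_iter=20, seed=0):
    """Alternating ridge updates of two dictionaries, a linear classifier and shared codes.

    Xs: (d_s, n) source features, Xt: (d_t, n) target features,
    Y:  (c, n) one-hot labels. Columns are assumed paired (illustrative assumption).
    Minimizes ||Z - M A||^2 + lam * (||A||^2 + ||M||^2) with
    Z = [Xs; Xt; sqrt(alpha) Y] and M = [Ds; Dt; sqrt(alpha) W].
    """
    rng = np.random.default_rng(seed)
    d_s, d_t = Xs.shape[0], Xt.shape[0]
    Z = np.vstack([Xs, Xt, np.sqrt(alpha) * Y])
    A = rng.standard_normal((n_atoms, Z.shape[1]))
    I = np.eye(n_atoms)
    losses = []
    for _ in range(n_iter):
        # Update both dictionaries and the classifier at once, codes fixed.
        M = np.linalg.solve(A @ A.T + lam * I, A @ Z.T).T
        # Update the shared codes, dictionaries and classifier fixed.
        A = np.linalg.solve(M.T @ M + lam * I, M.T @ Z)
        losses.append(np.sum((Z - M @ A) ** 2) + lam * (np.sum(A ** 2) + np.sum(M ** 2)))
    Ds, Dt = M[:d_s], M[d_s:d_s + d_t]
    W = M[d_s + d_t:] / np.sqrt(alpha)
    return Ds, Dt, W, A, losses

def predict_from_target(Xt, Dt, W, lam=0.1):
    """Encode target-only samples with the target dictionary, then classify the codes."""
    codes = np.linalg.solve(Dt.T @ Dt + lam * np.eye(Dt.shape[1]), Dt.T @ Xt)
    return np.argmax(W @ codes, axis=0)
```

In this sketch the source block could hold features computed from RGB and depth together, while the target block holds RGB-only features, so that only `Dt` and `W` are needed at test time. A sparse coding step would replace the ridge code update in a closer approximation of dictionary learning.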