Domain adaptation from RGB-D to RGB images

Depth cameras make it possible to use depth images to aid object recognition. However, when the target task is to classify RGB images, how can RGB-D images be used? To address this problem, we propose a novel domain adaptation method, named DARDR, that learns from RGB-D images in the source domain to recognize RGB images in the target domain. By introducing a cross-modal constraint and a cross-domain constraint, DARDR simultaneously maximizes the correlation between RGB and depth images in the source domain and minimizes the discrepancy between domains. We incorporate the two terms into least-squares classifiers and present a unified framework for learning the classifier parameters. The advantage of our method is that the correlation between source RGB and depth images and the discrepancy between source and target data are incorporated into the classifiers of the source and target data. To evaluate DARDR, we apply it to five cross-domain datasets. The experimental results demonstrate that our method achieves competitive performance against state-of-the-art methods on object recognition and scene classification tasks.

Highlights

- We aim to use RGB-D images to help recognize RGB images.
- The correlation between RGB and depth images in the source domain is maximized.
- The cross-domain and cross-modal constraints are jointly incorporated in the model.
- A unified framework is presented to learn the classifier parameters.
- Extensive experimental results show that depth information is useful for domain adaptation problems.
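To make the idea of folding a cross-modal term and a cross-domain term into least-squares classifiers more concrete, the following is a minimal sketch under stated assumptions. It is not the DARDR formulation, which the abstract does not spell out. The objective, the function names (`fit_coupled_least_squares`, `projected_mean_gap`), and the weights `lam`, `gamma`, and `mu` are all hypothetical. Here the cross-modal term is approximated by asking the RGB and depth classifiers to agree on source samples, and the cross-domain term by penalizing the gap between the source and target RGB feature means after projection.

```python
import numpy as np


def fit_coupled_least_squares(X_rgb, X_depth, Y, X_target,
                              lam=1.0, gamma=0.1, mu=1.0):
    """Jointly fit linear classifiers W_rgb and W_depth on source RGB-D data.

    Illustrative objective (an assumption, not the paper's formulation):
        ||X_rgb W_rgb - Y||^2 + ||X_depth W_depth - Y||^2
        + lam   * (||W_rgb||^2 + ||W_depth||^2)          # ridge term
        + gamma * ||X_rgb W_rgb - X_depth W_depth||^2     # cross-modal agreement
        + mu    * ||(mean_src - mean_tgt)^T W_rgb||^2     # cross-domain mean gap
    """
    p, q = X_rgb.shape[1], X_depth.shape[1]
    # Difference between source and target RGB feature means, as a column.
    d = (X_rgb.mean(axis=0) - X_target.mean(axis=0))[:, None]

    # Setting the gradient to zero gives a symmetric block linear system.
    A_rr = (1 + gamma) * X_rgb.T @ X_rgb + lam * np.eye(p) + mu * (d @ d.T)
    A_dd = (1 + gamma) * X_depth.T @ X_depth + lam * np.eye(q)
    A_rd = -gamma * X_rgb.T @ X_depth
    A = np.block([[A_rr, A_rd], [A_rd.T, A_dd]])
    B = np.vstack([X_rgb.T @ Y, X_depth.T @ Y])

    W = np.linalg.solve(A, B)
    return W[:p], W[p:]  # W_rgb, W_depth


def projected_mean_gap(W_rgb, X_rgb, X_target):
    """Squared distance between projected source and target RGB means."""
    d = X_rgb.mean(axis=0) - X_target.mean(axis=0)
    return float(np.sum((d @ W_rgb) ** 2))
```

With one-hot labels in `Y`, target RGB images (which have no depth channel) would be classified in this sketch with `np.argmax(X_target @ W_rgb, axis=1)`. The depth classifier only influences `W_rgb` through the agreement term during training.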
