Utilizing Relevant RGB–D Data to Help Recognize RGB Images in the Target Domain

Abstract With the advent of 3D cameras, getting depth information along with RGB images has been facilitated, which is helpful in various computer vision tasks. However, there are two challenges in using these RGB-D images to help recognize RGB images captured by conventional cameras: one is that the depth images are missing at the testing stage, the other is that the training and test data are drawn from different distributions as they are captured using different equipment. To jointly address the two challenges, we propose an asymmetrical transfer learning framework, wherein three classifiers are trained using the RGB and depth images in the source domain and RGB images in the target domain with a structural risk minimization criterion and regularization theory. A cross-modality co-regularizer is used to restrict the two-source classifier in a consistent manner to increase accuracy. Moreover, an L2,1 norm cross-domain co-regularizer is used to magnify significant visual features and inhibit insignificant ones in the weight vectors of the two RGB classifiers. Thus, using the cross-modality and cross-domain co-regularizer, the knowledge of RGB-D images in the source domain is transferred to the target domain to improve the target classifier. The results of the experiment show that the proposed method is one of the most effective ones.

[1]  Lin Chen,et al.  Visual Recognition in RGB Images and Videos by Learning from RGB-D Data , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[3]  Hui Xiong,et al.  A Unified Framework for Metric Transfer Learning , 2017, IEEE Transactions on Knowledge and Data Engineering.

[4]  Xiao Li,et al.  Domain adaptation from RGB-D to RGB images , 2017, Signal Process..

[5]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[6]  G. Griffin,et al.  Caltech-256 Object Category Dataset , 2007 .

[7]  Qiang Yang,et al.  Boosting for transfer learning , 2007, ICML '07.

[8]  Massimiliano Pontil,et al.  Regularized multi--task learning , 2004, KDD.

[9]  Dieter Fox,et al.  Multipath Sparse Coding Using Hierarchical Matching Pursuit , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Vladimir Vapnik,et al.  A new learning paradigm: Learning using privileged information , 2009, Neural Networks.

[11]  Donald A. Adjeroh,et al.  Information Bottleneck Learning Using Privileged Information for Visual Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Richard Bowden,et al.  Hollywood 3D: Recognizing Actions in 3D Natural Scenes , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Peter Tiño,et al.  Incorporating Privileged Information Through Metric Learning , 2013, IEEE Transactions on Neural Networks and Learning Systems.

[14]  Adriana Kovashka,et al.  Learning a hierarchy of discriminative space-time neighborhood features for human action recognition , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15]  Rong Yan,et al.  Cross-domain video concept detection using adaptive svms , 2007, ACM Multimedia.

[16]  Sebastian Nowozin,et al.  Let the kernel figure it out; Principled learning of pre-processing for kernel classifiers , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[18]  Trevor Darrell,et al.  What you saw is not what you get: Domain adaptation using asymmetric kernel transforms , 2011, CVPR 2011.

[19]  Ivor W. Tsang,et al.  Learning With Augmented Features for Supervised and Semi-Supervised Heterogeneous Domain Adaptation , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Massimiliano Pontil,et al.  Convex multi-task feature learning , 2008, Machine Learning.

[21]  Uwe Aickelin,et al.  Privileged information for data clustering , 2012, Inf. Sci..

[22]  Dieter Fox,et al.  A large-scale hierarchical multi-view RGB-D object dataset , 2011, 2011 IEEE International Conference on Robotics and Automation.

[23]  Dong Xu,et al.  Recognizing RGB Images by Learning from RGB-D Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[25]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[26]  Jieping Ye,et al.  Multi-Task Feature Learning Via Efficient l2, 1-Norm Minimization , 2009, UAI.

[27]  Taghi M. Khoshgoftaar,et al.  A survey of transfer learning , 2016, Journal of Big Data.

[28]  Philip S. Yu,et al.  Adaptation Regularization: A General Framework for Transfer Learning , 2014, IEEE Transactions on Knowledge and Data Engineering.

[29]  Jean-Luc Dugelay,et al.  An Efficient LBP-Based Descriptor for Facial Depth Images Applied to Gender Recognition Using RGB-D Face Data , 2012, ACCV Workshops.

[30]  Saeid Motiian,et al.  Information Bottleneck Domain Adaptation with Privileged Information for Visual Recognition , 2016, ECCV.

[31]  Zoltan-Csaba Marton,et al.  Improving object classification robustness in RGB-D using adaptive SVMs , 2015, Multimedia Tools and Applications.

[32]  Christoph H. Lampert,et al.  Learning to Rank Using Privileged Information , 2013, 2013 IEEE International Conference on Computer Vision.

[33]  Shiliang Sun,et al.  A survey of multi-view machine learning , 2013, Neural Computing and Applications.

[34]  Yun Fu,et al.  Discriminative Relational Representation Learning for RGB-D Action Recognition , 2016, IEEE Transactions on Image Processing.

[35]  Raymond J. Mooney,et al.  Mapping and Revising Markov Logic Networks for Transfer Learning , 2007, AAAI.

[36]  Richa Singh,et al.  RGB-D Face Recognition With Texture and Attribute Features , 2014, IEEE Transactions on Information Forensics and Security.