Domain Adaptation for Visual Recognition

Domain adaptation is an active, emerging research area that attempts to address the changes in data distribution across training and testing datasets. With the availability of a multitude of image acquisition sensors, and variations due to illumination and viewpoint, among others, computer vision applications present a very natural test bed for evaluating domain adaptation methods. In this monograph, we provide a comprehensive overview of domain adaptation solutions for visual recognition problems. Starting with the problem description and illustrations, we discuss three adaptation scenarios, namely (i) unsupervised adaptation, where the "source domain" training data is partially labeled and the "target domain" test data is unlabeled, (ii) semi-supervised adaptation, where the target domain also has partial labels, and (iii) multi-domain heterogeneous adaptation, which studies the previous two settings with the source and/or target having more than one domain, and accounts for cases where the features used to represent the data in each domain are different. For all these topics we discuss existing adaptation techniques in the literature, which are motivated by the principles of max-margin discriminative learning, manifold learning, sparse coding, and low-rank representations. These techniques have shown improved performance on a variety of applications such as object recognition, face recognition, activity analysis, concept classification, and person detection. We then conclude by analyzing the challenges posed by the realm of "big visual data", in terms of the generalization ability of adaptation algorithms to unconstrained data acquisition as well as issues related to their computational tractability, and draw parallels with the efforts of the vision community on image transformation models and invariant descriptors, so as to facilitate improved understanding of vision problems under uncertainty.
