A survey of transfer learning

Machine learning and data mining techniques have been used in numerous real-world applications. An assumption of traditional machine learning methodologies is the training data and testing data are taken from the same domain, such that the input feature space and data distribution characteristics are the same. However, in some real-world machine learning scenarios, this assumption does not hold. There are cases where training data is expensive or difficult to collect. Therefore, there is a need to create high-performance learners trained with more easily obtained data from different domains. This methodology is referred to as transfer learning. This survey paper formally defines transfer learning, presents information on current solutions, and reviews applications applied to transfer learning. Lastly, there is information listed on software downloads for various transfer learning solutions and a discussion of possible future research work. The transfer learning solutions surveyed are independent of data size and can be applied to big data environments.

[1]  Yi Yao,et al.  Boosting for transfer learning with multiple sources , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Jian Yu,et al.  Learning Transferred Weights From Co-Occurrence Data for Heterogeneous Transfer Learning , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[3]  Tong Zhang,et al.  A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..

[4]  Taghi M. Khoshgoftaar,et al.  Choosing software metrics for defect prediction: an investigation on feature selection techniques , 2011, Softw. Pract. Exp..

[5]  Ivor W. Tsang,et al.  Heterogeneous Domain Adaptation for Multiple Classes , 2014, AISTATS.

[6]  Ivor W. Tsang,et al.  Domain Adaptation via Transfer Component Analysis , 2009, IEEE Transactions on Neural Networks.

[7]  Rong Yan,et al.  Cross-domain video concept detection using adaptive svms , 2007, ACM Multimedia.

[8]  Sethuraman Panchanathan,et al.  Multi-source domain adaptation and its application to early detection of fatigue , 2011, KDD.

[9]  Foster J. Provost,et al.  Machine learning for targeted display advertising: transfer learning in action , 2013, Machine Learning.

[10]  Diane J. Cook,et al.  Transfer Learning across Feature-Rich Heterogeneous Feature Spaces via Feature-Space Remapping (FSR) , 2015, ACM Trans. Intell. Syst. Technol..

[11]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Qiang Yang,et al.  Translated Learning: Transfer Learning across Different Feature Spaces , 2008, NIPS.

[13]  Bing Liu,et al.  Mining and summarizing customer reviews , 2004, KDD.

[14]  Fei Wang,et al.  Social recommendation across multiple relational domains , 2012, CIKM.

[15]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[16]  Hal Daumé,et al.  Frustratingly Easy Domain Adaptation , 2007, ACL.

[17]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.

[18]  Ivor W. Tsang,et al.  This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 1 Domain Adaptation from Multiple Sources: A Domain- , 2022 .

[19]  Y. LeCun,et al.  Learning methods for generic object recognition with invariance to pose and lighting , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[20]  Philip S. Yu,et al.  Adaptation Regularization: A General Framework for Transfer Learning , 2014, IEEE Transactions on Knowledge and Data Engineering.

[21]  Luc Van Gool,et al.  SURF: Speeded Up Robust Features , 2006, ECCV.

[22]  Jian Dong,et al.  Contextualizing Object Detection and Classification , 2015, IEEE Trans. Pattern Anal. Mach. Intell..

[23]  Erik Marchi,et al.  Sparse Autoencoder-Based Feature Transfer Learning for Speech Emotion Recognition , 2013, 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction.

[24]  Cordelia Schmid,et al.  Action recognition by dense trajectories , 2011, CVPR 2011.

[25]  Dong Xu,et al.  Action recognition using context and appearance distribution features , 2011, CVPR 2011.

[26]  Yutao Ma,et al.  Towards Cross-Project Defect Prediction with Imbalanced Feature Sets , 2014, ArXiv.

[27]  Qiang Yang,et al.  Cross-Domain Co-Extraction of Sentiment and Topic Lexicons , 2012, ACL.

[28]  Ali Farhadi,et al.  Transfer Learning in Sign language , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Raymond J. Mooney,et al.  Transfer Learning by Mapping with Minimal Target Data , 2008 .

[30]  Vladimir Vapnik,et al.  Principles of Risk Minimization for Learning Theory , 1991, NIPS.

[31]  Benno Stein,et al.  Cross-Language Text Classification Using Structural Correspondence Learning , 2010, ACL.

[32]  Qiang Yang,et al.  Cross-domain sentiment classification via spectral feature alignment , 2010, WWW '10.

[33]  Derek Hoiem,et al.  Building text features for object image classification , 2009, CVPR.

[34]  Iryna Gurevych,et al.  Extracting Opinion Targets in a Single and Cross-Domain Setting with Conditional Random Fields , 2010, EMNLP.

[35]  Cordelia Schmid,et al.  Semantic Hierarchies for Visual Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Philip S. Yu,et al.  Transfer Feature Learning with Joint Distribution Adaptation , 2013, 2013 IEEE International Conference on Computer Vision.

[37]  Charu C. Aggarwal,et al.  Towards semantic knowledge propagation from text corpus to web images , 2011, WWW.

[38]  Dit-Yan Yeung,et al.  Transfer metric learning by learning task relationships , 2010, KDD.

[39]  Min Xiao,et al.  Semi-Supervised Kernel Matching for Domain Adaptation , 2012, AAAI.

[40]  Gunnar Rätsch,et al.  An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis , 2008, NIPS.

[41]  Jenna Wiens,et al.  A study in transfer learning: leveraging data from multiple hospitals to enhance hospital-specific predictions , 2014, J. Am. Medical Informatics Assoc..

[42]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[43]  Deepak S. Turaga,et al.  Cross domain distribution adaptation via kernel mapping , 2009, KDD.

[44]  Rajat Raina,et al.  Self-taught learning: transfer learning from unlabeled data , 2007, ICML '07.

[45]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[46]  Wei Gong,et al.  Transfer learning used to analyze the dynamic evolution of the dust aerosol , 2015 .

[47]  M. Ng,et al.  Co-transfer learning via joint transition probability graph based method , 2012, CDKD '12.

[48]  Edwin V. Bonilla,et al.  Multi-task Gaussian Process Prediction , 2007, NIPS.

[49]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[50]  ChengXiang Zhai,et al.  Instance Weighting for Domain Adaptation in NLP , 2007, ACL.

[51]  Rama Chellappa,et al.  Visual Domain Adaptation: A survey of recent advances , 2015, IEEE Signal Processing Magazine.

[52]  Sunghun Kim,et al.  Reducing Features to Improve Code Change-Based Bug Prediction , 2013, IEEE Transactions on Software Engineering.

[53]  Diane J. Cook,et al.  Transfer learning for activity recognition: a survey , 2013, Knowledge and Information Systems.

[54]  Yuan Shi,et al.  Information-Theoretical Learning of Discriminative Clusters for Unsupervised Domain Adaptation , 2012, ICML.

[55]  Maayan Harel,et al.  Learning from Multiple Outlooks , 2010, ICML.

[56]  Subramanian Ramanathan,et al.  Exploring Transfer Learning Approaches for Head Pose Classification from Multi-view Surveillance Images , 2013, International Journal of Computer Vision.

[57]  Qiang Yang,et al.  Transfer Learning for Collective Link Prediction in Multiple Heterogenous Domains , 2010, ICML.

[58]  Shyam Visweswaran,et al.  Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data , 2015, BMC Bioinformatics.

[59]  Qiang Yang,et al.  Active Transfer Learning for Cross-System Recommendation , 2013, AAAI.

[60]  M. Kloft,et al.  l p -Norm Multiple Kernel Learning , 2011 .

[61]  Qiang Yang,et al.  Can Movies and Books Collaborate? Cross-Domain Collaborative Filtering for Sparsity Reduction , 2009, IJCAI.

[62]  Daniel D. Lee,et al.  Learning High Dimensional Correspondences from Low Dimensional Manifolds , 2003 .

[63]  Qiang Yang,et al.  Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence Transfer Learning to Predict Missing Ratings via Heterogeneous User Feedbacks , 2022 .

[64]  Stefano Ermon,et al.  Transfer Learning from Deep Features for Remote Sensing and Poverty Mapping , 2015, AAAI.

[65]  Qiang Yang,et al.  Heterogeneous Transfer Learning for Image Clustering via the SocialWeb , 2009, ACL.

[66]  Rama Chellappa,et al.  Domain adaptation for object recognition: An unsupervised approach , 2011, 2011 International Conference on Computer Vision.

[67]  Jun Huan,et al.  Large margin transductive transfer learning , 2009, CIKM.

[68]  Dacheng Tao,et al.  Bregman Divergence-Based Regularization for Transfer Subspace Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[69]  John Blitzer,et al.  Domain Adaptation with Structural Correspondence Learning , 2006, EMNLP.

[70]  Chang Wang,et al.  Heterogeneous Domain Adaptation Using Manifold Alignment , 2011, IJCAI.

[71]  Guy Shani,et al.  TALMUD: transfer learning for multiple domains , 2012, CIKM.

[72]  H. Shimodaira,et al.  Improving predictive inference under covariate shift by weighting the log-likelihood function , 2000 .

[73]  Ivor W. Tsang,et al.  Hybrid Heterogeneous Transfer Learning through Deep Learning , 2014, AAAI.

[74]  Nello Cristianini,et al.  Kernel Methods for Pattern Analysis , 2004 .

[75]  Bin Cao,et al.  Multi-Domain Collaborative Filtering , 2010, UAI.

[76]  Tim Menzies,et al.  Heterogeneous Defect Prediction , 2018, IEEE Trans. Software Eng..

[77]  Yizhou Sun,et al.  Graph-based Consensus Maximization among Multiple Supervised and Unsupervised Models , 2009, NIPS.

[78]  Harikrishna Chaganti,et al.  Social Transfer Cross Domain Transfer Learning From Social Streams for Media Applications , 2016 .

[79]  Qiang Yang,et al.  Boosting for transfer learning , 2007, ICML '07.

[80]  Gunnar Rätsch,et al.  Multitask Learning in Computational Biology , 2011, ICML Unsupervised and Transfer Learning.

[81]  Alex Acero,et al.  Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lo , 2006, Comput. Speech Lang..

[82]  Ivor W. Tsang,et al.  Learning with Augmented Features for Heterogeneous Domain Adaptation , 2012, ICML.

[83]  Barbara Caputo,et al.  The More You Know, the Less You Learn: From Knowledge Transfer to One-shot Learning of Object Categories , 2009, BMVC.

[84]  Jing Gao,et al.  On handling negative transfer and imbalanced distributions in multiple source transfer learning , 2014, SDM.

[85]  Yoshua Bengio,et al.  Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.

[86]  Mirella Lapata,et al.  Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics , 1999, ACL 1999.

[87]  Juhan Nam,et al.  Multimodal Deep Learning , 2011, ICML.

[88]  Rui Xia,et al.  A POS-based Ensemble Model for Cross-domain Sentiment Classification , 2011, IJCNLP.

[89]  Tao Mei,et al.  SocialTransfer: cross-domain transfer learning from social streams for media applications , 2012, ACM Multimedia.

[90]  Christopher Joseph Pal,et al.  Heterogeneous Transfer Learning with RBMs , 2011, AAAI.

[91]  Mikhail Belkin,et al.  Manifold Regularization : A Geometric Framework for Learning from Examples , 2004 .

[92]  Massimiliano Pontil,et al.  Transfer learning to account for idiosyncrasy in face and body expressions , 2013, 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[93]  Ivor W. Tsang,et al.  Combating Negative Transfer From Predictive Distribution Differences , 2013, IEEE Transactions on Cybernetics.

[94]  Qiang Yang,et al.  Spectral domain-transfer learning , 2008, KDD.

[95]  Trevor Darrell,et al.  What you saw is not what you get: Domain adaptation using asymmetric kernel transforms , 2011, CVPR 2011.

[96]  Qiang Yang,et al.  Heterogeneous Transfer Learning for Image Classification , 2011, AAAI.

[97]  Ivor W. Tsang,et al.  Learning With Augmented Features for Supervised and Semi-Supervised Heterogeneous Domain Adaptation , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[98]  Fan Chung,et al.  Spectral Graph Theory , 1996 .

[99]  Shiguang Shan,et al.  Domain Adaptation for Face Recognition: Targetize Source Domain Bridged by Common Subspace , 2013, International Journal of Computer Vision.

[100]  Ivor W. Tsang,et al.  Domain Transfer Multiple Kernel Learning , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[101]  Chengqing Zong,et al.  Multi-domain adaptation for sentiment classification: Using multiple classifier combining methods , 2008, 2008 International Conference on Natural Language Processing and Knowledge Engineering.

[102]  Nello Cristianini,et al.  Inferring a Semantic Representation of Text via Cross-Language Correlation Analysis , 2002, NIPS.

[103]  Shih-Fu Chang,et al.  Cross-domain learning methods for high-level visual concept classification , 2008, 2008 15th IEEE International Conference on Image Processing.

[104]  Hans-Peter Kriegel,et al.  Integrating structured biological data by Kernel Maximum Mean Discrepancy , 2006, ISMB.

[105]  Barbara Caputo,et al.  Safety in numbers: Learning categories from few examples with multi model knowledge transfer , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[106]  Alexander Zien,et al.  lp-Norm Multiple Kernel Learning , 2011, J. Mach. Learn. Res..

[107]  Massimiliano Pontil,et al.  Regularized multi--task learning , 2004, KDD.

[108]  Ingo Steinwart,et al.  On the Influence of the Kernel on the Consistency of Support Vector Machines , 2002, J. Mach. Learn. Res..

[109]  Philip S. Yu,et al.  Transfer Learning on Heterogenous Feature Spaces via Spectral Transformation , 2010, 2010 IEEE International Conference on Data Mining.

[110]  Hui Xiong,et al.  Transfer learning from multiple source domains via consensus regularization , 2008, CIKM '08.

[111]  Christopher Joseph Pal,et al.  Cross Lingual Adaptation: An Experiment on Sentiment Classifications , 2010, ACL.

[112]  Lorenzo Bruzzone,et al.  Domain Adaptation Problems: A DASVM Classification Technique and a Circular Validation Strategy , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[113]  Qiang Yang,et al.  Transfer Learning via Dimensionality Reduction , 2008, AAAI.

[114]  Ivan Laptev,et al.  Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[115]  Hang Li,et al.  Proceedings of the 1st International Workshop on Cross Domain Knowledge Discovery in Web and Social Network Mining , 2012, KDD 2012.

[116]  Cordelia Schmid,et al.  Learning Object Representations for Visual Object Class Recognition , 2007, ICCV 2007.

[117]  อนิรุธ สืบสิงห์,et al.  Data Mining Practical Machine Learning Tools and Techniques , 2014 .

[118]  Qiang Yang,et al.  Transfer Learning in Collaborative Filtering for Sparsity Reduction , 2010, AAAI.

[119]  Bernhard Schölkopf,et al.  Correcting Sample Selection Bias by Unlabeled Data , 2006, NIPS.

[120]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[121]  Rui Xia,et al.  Feature Ensemble Plus Sample Selection: Domain Adaptation for Sentiment Classification , 2013, IEEE Intelligent Systems.

[122]  Dong Xu,et al.  Exploiting web images for event recognition in consumer videos: A multiple source domain adaptation approach , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[123]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[124]  Jiawei Han,et al.  Knowledge transfer via multiple model local structure mapping , 2008, KDD.

[125]  Luc Van Gool,et al.  Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..

[126]  Ling Shao,et al.  Transfer Learning for Visual Categorization: A Survey , 2015, IEEE Transactions on Neural Networks and Learning Systems.

[127]  Eric Eaton,et al.  Modeling Transfer Relationships Between Learning Tasks for Improved Inductive Transfer , 2008, ECML/PKDD.

[128]  Gavin C. Cawley,et al.  Leave-One-Out Cross-Validation Based Model Selection Criteria for Weighted LS-SVMs , 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.

[129]  Kilian Q. Weinberger,et al.  Marginalized Denoising Autoencoders for Domain Adaptation , 2012, ICML.

[130]  Peter Stone,et al.  Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..

[131]  Andrew Zisserman,et al.  Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[132]  Thomas G. Dietterich,et al.  To transfer or not to transfer , 2005, NIPS 2005.

[133]  T. Landauer,et al.  Indexing by Latent Semantic Analysis , 1990 .

[134]  Qiang Yang,et al.  Transfer learning for collaborative filtering via a rating-matrix generative model , 2009, ICML '09.