Ensemble of Anchor Adapters for Transfer Learning

Over the past decade, a large number of transfer learning algorithms have been proposed for various real-world applications. However, most of them are vulnerable to negative transfer, in which case their performance is even worse than that of traditional supervised models. Aiming at more robust transfer learning, we propose an ENsemble framework of anCHOR adapters (ENCHOR for short), in which an anchor adapter adapts the features of instances based on their similarities to a specific anchor (i.e., a selected instance). Specifically, the more similar an instance is to the anchor, the more of its original feature representation is preserved in the adapted representation, and vice versa. The adapted representation thus expresses the local structure around the corresponding anchor, and any transfer learning method can then be applied to it to obtain a prediction model that focuses on the neighborhood of the anchor. Based on multiple anchors, multiple anchor adapters can be built and combined into an ensemble for the final output. Additionally, we develop an effective measure for selecting anchors to build the ensemble, which further improves performance. Extensive experiments on hundreds of text classification tasks demonstrate the effectiveness of ENCHOR. The results show that when traditional supervised models perform poorly, ENCHOR (based on only eight selected anchors) achieves a 6%–13% increase in average accuracy over state-of-the-art methods, and it greatly alleviates negative transfer.
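The core idea of an anchor adapter can be sketched in a few lines. The following is a minimal illustrative sketch, not the paper's exact method: here we assume the adaptation simply scales each instance's feature vector by its cosine similarity to the anchor, so that instances near the anchor keep their original features while distant ones are attenuated.

```python
import numpy as np

def anchor_adapt(X, anchor):
    """Adapt instance features based on similarity to an anchor instance.

    Each row of X is scaled by its cosine similarity to `anchor` (clipped to
    [0, 1]): instances similar to the anchor keep most of their original
    features; dissimilar instances are attenuated toward zero. This scaling
    rule is an illustrative assumption, not the paper's exact adapter.
    """
    norms = np.linalg.norm(X, axis=1) * np.linalg.norm(anchor)
    sims = X @ anchor / np.clip(norms, 1e-12, None)  # cosine similarity
    weights = np.clip(sims, 0.0, 1.0)                # keep weights in [0, 1]
    return X * weights[:, None]

# Toy usage: three instances; the anchor equals the first instance.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
anchor = np.array([1.0, 0.0])
X_adapted = anchor_adapt(X, anchor)
# Row 0 (identical to the anchor) is unchanged; row 1 (orthogonal) is
# zeroed out; row 2 is partially attenuated.
```

An ensemble in the spirit of ENCHOR would repeat this for several anchors, train a base transfer model on each adapted representation, and combine the resulting predictions (e.g., by averaging) for the final output.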
