A Minimax Game for Instance based Selective Transfer Learning

Deep neural network based transfer learning has been widely used to leverage information from a data-rich domain to help a domain with insufficient data. When the source data distribution differs from the target data distribution, transferring knowledge between the domains may lead to negative transfer. A typical way to mitigate this problem is to select useful source domain data for transfer. However, few studies focus on selecting high-quality source data for neural network based transfer learning. To bridge this gap, we propose a general Minimax Game based model for selective Transfer Learning (MGTL). More specifically, the proposed method comprises a selector, a discriminator and a TL module. The discriminator aims to maximize the differences between the selected source data and the target data, while the selector acts as an attacker that selects source data close to the target data in order to minimize those differences. The TL module trains on the selected data and provides rewards to guide the selector. These three modules play a minimax game to help select useful source data for transfer. Our method is also shown to speed up training of the learning task in the target domain compared with traditional TL methods. To the best of our knowledge, this is the first work to build a minimax game based model for selective transfer learning. To examine the generality of our method, we evaluate it on two different tasks: item recommendation and text retrieval. Extensive experiments on both public and real-world datasets demonstrate that our model outperforms the competing methods by a large margin. Meanwhile, the quantitative evaluation shows that our model can select data that are close to the target data. Our model has also been deployed in a real-world system, where significant improvement over the baselines is observed.
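The interplay between the selector and the discriminator can be illustrated with a small toy sketch. It is not the MGTL implementation: it assumes one-dimensional instances, logistic models for both players, and a REINFORCE-style update for the selector. All names (`play_toy_game`, `reinforce_grad`, `tl_reward`) are hypothetical, and the TL module's reward is reduced to a placeholder hook that returns zero by default.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_grad(x, action, p, advantage):
    """Gradient of advantage * log pi(action | x) w.r.t. (a, b) for a
    Bernoulli policy with p = sigmoid(a * x + b)."""
    g = (action - p) * advantage
    return g * x, g


def play_toy_game(source, target, steps=200, lr=0.05, seed=0,
                  tl_reward=lambda x: 0.0):
    """Alternate discriminator and selector updates on 1-D toy data.

    tl_reward is a hypothetical stand-in for the reward that a TL module
    trained on the selected data would provide (assumption: zero here).
    """
    rng = random.Random(seed)
    a = b = 0.0  # selector: P(keep x) = sigmoid(a * x + b)
    w = c = 0.0  # discriminator: D(x) = sigmoid(w * x + c) = P(x is target)
    for _ in range(steps):
        # Selector samples a keep/drop action for every source instance.
        probs = [sigmoid(a * x + b) for x in source]
        actions = [1 if rng.random() < p else 0 for p in probs]
        selected = [x for x, act in zip(source, actions) if act]
        if not selected:
            continue  # nothing was kept this round; sample again

        # Discriminator: one gradient-ascent step on
        # mean log D(target) + mean log(1 - D(selected source)).
        d_t = [sigmoid(w * x + c) for x in target]
        d_s = [sigmoid(w * x + c) for x in selected]
        gw = (sum((1 - d) * x for d, x in zip(d_t, target)) / len(target)
              - sum(d * x for d, x in zip(d_s, selected)) / len(selected))
        gc = (sum(1 - d for d in d_t) / len(target)
              - sum(d_s) / len(selected))
        w += lr * gw
        c += lr * gc

        # Selector: rewarded when a kept instance looks like target data
        # to the discriminator, plus the placeholder TL reward.
        reward = [math.log(sigmoid(w * x + c)) + tl_reward(x) for x in source]
        baseline = sum(r for r, act in zip(reward, actions) if act) / len(selected)
        ga = gb = 0.0
        for x, act, p, r in zip(source, actions, probs, reward):
            if act:  # dropped instances get zero advantage in this sketch
                dx, db = reinforce_grad(x, act, p, r - baseline)
                ga += dx
                gb += db
        a += lr * ga / len(source)
        b += lr * gb / len(source)
    return a, b, w, c


if __name__ == "__main__":
    # Toy data: half of the source lies near the target, half far away.
    source = [0.0, 0.1, -0.1, 0.2, 3.0, 2.9, 3.1, 2.8]
    target = [3.0, 3.1, 2.9, 3.2]
    a, b, w, c = play_toy_game(source, target)
    print("P(keep x=3.0) =", sigmoid(a * 3.0 + b))
    print("P(keep x=0.0) =", sigmoid(a * 0.0 + b))
```

In this toy setting one would expect the selector to drift toward keeping source instances near the target cluster, since those earn a higher discriminator-based reward than the batch baseline. A real TL module would replace `tl_reward` with a signal derived from target-task performance.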
