Feature-Based Transfer Learning Based on Distribution Similarity

Transfer learning can enhance learning in a target domain by transferring useful knowledge from different but related source domains. In many applications, collecting and labeling target data is both difficult and expensive, while considerable prior experience already exists in other application domains. This paper proposes a feature-based transfer learning method based on distribution similarity, aimed at the case where the features of the two domains only partially overlap. The non-overlapping features are completed by leveraging their distribution similarity to other features within the source domain. Features of the two domains are then reweighted according to the distribution similarity between the source and target domains. This decreases the distribution discrepancy between the two domains and thereby achieves the desired feature transfer. Experiments on Facebook and Sina Microblog data sets show that the proposed method effectively improves the accuracy of the prediction function.
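
The reweighting step can be illustrated with a minimal sketch. The abstract does not state which similarity measure or weighting rule the method uses, so everything below is an assumption made for illustration: similarity is taken as one minus the Jensen-Shannon divergence between per-feature histograms, and each shared feature is scaled by that similarity. The function names (`js_divergence`, `feature_similarity`, `reweight`) and the `bins` parameter are hypothetical, and the feature-completion step is not covered.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (base 2) between two histograms; lies in [0, 1]."""
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # eps avoids log(0); terms with a == 0 contribute exactly 0.
        return np.sum(a * np.log2((a + eps) / (b + eps)))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def feature_similarity(xs, xt, bins=10):
    """Assumed similarity measure: 1 - JS divergence on shared histogram bins."""
    lo = min(xs.min(), xt.min())
    hi = max(xs.max(), xt.max())
    p, _ = np.histogram(xs, bins=bins, range=(lo, hi))
    q, _ = np.histogram(xt, bins=bins, range=(lo, hi))
    sim = 1.0 - js_divergence(p.astype(float), q.astype(float))
    return float(np.clip(sim, 0.0, 1.0))


def reweight(Xs, Xt, bins=10):
    """Scale every shared feature (column) by its source/target similarity."""
    w = np.array([feature_similarity(Xs[:, j], Xt[:, j], bins)
                  for j in range(Xs.shape[1])])
    return Xs * w, Xt * w, w


# Toy data: feature 0 has the same distribution in both domains,
# feature 1 is shifted so far that the two histograms do not overlap.
x = np.linspace(0.0, 1.0, 200)
Xs = np.column_stack([x, x])
Xt = np.column_stack([x, x + 10.0])
Xs_w, Xt_w, weights = reweight(Xs, Xt)
```

In this toy example the feature with matching distributions keeps a weight close to one, while the shifted feature is weighted close to zero, so a model trained on the reweighted source features relies mainly on features that behave alike in both domains.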
