Knowledge Source Selection by Estimating Distance between Datasets

Most traditional machine learning methods assume that the training data follow the same distribution as the data in the domain where the model is applied. Transfer learning drops this assumption and can transfer knowledge between different domains, which makes it a promising way to make machine learning more practical. However, negative transfer can degrade the performance of the model and should therefore be avoided. In this paper, we focus on how to select a good knowledge source when multiple labelled datasets are available. We give a method to estimate the divergence between two labelled datasets, and we also provide a method to determine the mappings between features in different datasets. The experimental results show that the divergence estimated by our method is closely related to the performance of the model.
