Learning Transferred Weights From Co-Occurrence Data for Heterogeneous Transfer Learning

A central research problem in heterogeneous transfer learning is to determine whether a given source domain can effectively transfer knowledge to a target domain, and, if so, how much knowledge should be transferred. This paper addresses the problem by evaluating the relatedness among the given domains through transferred weights. We propose a novel method that learns these transferred weights with the aid of co-occurrence data, which contain the same set of instances represented in different feature spaces. Because instances of the same category should have similar features, our method computes the principal components in each feature space so that the co-occurrence data can be re-represented by these principal components. For the same instance in the co-occurrence data, the principal component coefficients from different feature spaces have the same order of significance for describing the category information. Using these principal component coefficients, the Markov Chain Monte Carlo method is employed to construct a directed cyclic network in which each node is a domain and each edge weight is the conditional dependence of one domain on another. The edge weight of the network can then be used as the transferred weight from a source domain to a target domain. These weight values can serve as a prior for setting the parameters of existing heterogeneous transfer learning methods, controlling the amount of knowledge transferred from a source domain to a target domain. Experimental results on synthetic and real-world data sets illustrate the effectiveness of the proposed method, which can capture strong or weak relations among feature spaces and enhance the learning performance of heterogeneous transfer learning.
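The re-representation step can be illustrated with a minimal sketch. It assumes synthetic co-occurrence data (the same instances observed in three hypothetical feature spaces, `view_a`, `view_b`, and `view_c`) and computes principal component coefficients in each space with NumPy. The function `linear_dependence` is a simple least-squares stand-in for the strength of the relation between two domains; it is not the Markov Chain Monte Carlo network construction described in the abstract, and all names here are illustrative.

```python
import numpy as np

def principal_coefficients(X, k):
    """Return the coefficients of each row of X on its top-k principal components."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # shape: (n_instances, k)

def linear_dependence(src, tgt):
    """Stand-in score (assumption, not the paper's method): fraction of the
    variance in tgt explained by a least-squares linear map from src."""
    coef, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    resid = tgt - src @ coef
    return 1.0 - resid.var() / tgt.var()

# Synthetic co-occurrence data: the same n instances in different feature spaces.
rng = np.random.default_rng(0)
n, k = 200, 3
latent = rng.normal(size=(n, k))                 # shared hidden structure
view_a = latent @ rng.normal(size=(k, 20)) + 0.1 * rng.normal(size=(n, 20))
view_b = latent @ rng.normal(size=(k, 35)) + 0.1 * rng.normal(size=(n, 35))
view_c = rng.normal(size=(n, 15))                # unrelated feature space

coef_a = principal_coefficients(view_a, k)
coef_b = principal_coefficients(view_b, k)
coef_c = principal_coefficients(view_c, k)

w_ab = linear_dependence(coef_a, coef_b)         # related spaces: high score
w_ac = linear_dependence(coef_a, coef_c)         # unrelated spaces: low score
print(f"a -> b: {w_ab:.3f}, a -> c: {w_ac:.3f}")
```

In this toy setting the score separates a strongly related pair of feature spaces from an unrelated pair, which is the kind of distinction the transferred weights are meant to capture. A faithful implementation would replace the stand-in score with edge weights learned by the network construction.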
