Source-Selection-Free Transfer Learning

Transfer learning addresses the problem of having too little labeled training data to produce a high-performance model. Typically, given a target learning task, most transfer learning approaches require the designer to select one or more auxiliary tasks as sources. However, how to automatically select the right source data to enable effective knowledge transfer is still an unsolved problem, which limits the applicability of transfer learning. In this paper, we take a step forward and propose a novel transfer learning framework, known as source-selection-free transfer learning (SSFTL), to free users from the need to select source domains. Instead of asking users for source and target data pairs, as traditional transfer learning does, SSFTL turns to online information sources such as the World Wide Web or Wikipedia for help. The source data for transfer learning may be hidden somewhere within such a large online information source, but users do not know where. Based on the online information sources, we train a large number of classifiers. Then, given a target task, SSFTL builds a bridge between the labels of the potential source candidates and the target domain data, using a large online social media site with a tag cloud as a label translator. An added advantage of SSFTL is that, unlike many previous transfer learning approaches, which are difficult to scale up to Web scale, SSFTL is highly scalable and can offload much of the training work to an offline stage. We demonstrate the effectiveness and efficiency of SSFTL through extensive experiments on several real-world text classification datasets.
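The label-bridging idea can be pictured with a toy sketch. This is an illustration under stated assumptions, not the algorithm from the paper: the two-dimensional label vectors in `LABEL_EMBEDDING` are hand-made stand-ins for what a tag cloud might provide, and the function names `embed_document` and `predict_target_label` are hypothetical. A document is placed in the shared label space by averaging the source labels weighted by the source classifiers' outputs, and the closest target label is returned.

```python
import math

# Hypothetical label vectors in a shared space; hand-made for illustration,
# standing in for embeddings that could be derived from tag co-occurrence.
LABEL_EMBEDDING = {
    "football": (1.0, 0.1),    # source label
    "basketball": (0.9, 0.2),  # source label
    "stocks": (0.1, 1.0),      # source label
    "banking": (0.2, 0.9),     # source label
    "sports": (1.0, 0.0),      # target label
    "finance": (0.0, 1.0),     # target label
}


def embed_document(source_probs):
    """Map source-classifier outputs {label: score} into the shared label space.

    The result is the score-weighted average of the source label vectors.
    """
    total = sum(source_probs.values())
    dims = len(next(iter(LABEL_EMBEDDING.values())))
    return [
        sum(p * LABEL_EMBEDDING[label][d] for label, p in source_probs.items()) / total
        for d in range(dims)
    ]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def predict_target_label(source_probs, target_labels):
    """Return the target label whose vector is closest to the embedded document."""
    doc = embed_document(source_probs)
    return max(target_labels, key=lambda t: cosine(doc, LABEL_EMBEDDING[t]))


if __name__ == "__main__":
    # Scores that pre-trained source classifiers might assign to one document.
    scores = {"football": 0.7, "stocks": 0.1, "banking": 0.2}
    print(predict_target_label(scores, ["sports", "finance"]))
```

In this sketch the source classifiers and the label vectors would both be prepared ahead of time, so only the cheap projection and nearest-label lookup run when a new target task arrives.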
