Testing exchangeability for transfer decision

A novel solution to the problem of whether to transfer, based on an exchangeability test. The test statistically checks whether the source data are generated from the target distribution. The test is non-parametric and distribution-free. Experiments empirically justify that the proposed test is effective for predicting the transfer result.

This paper introduces a non-parametric test for deciding whether to transfer data from a source domain to a target domain in order to improve the generalization performance of predictive models on the target domain. The test is based on the conformal prediction framework: it statistically tests whether the target and source data are generated from the same distribution under the exchangeability assumption. Experiments show that the test can outperform existing methods when deciding on instance transfer.
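To make the idea of a conformal exchangeability check concrete, here is a minimal sketch. It is not the paper's procedure: the nonconformity score (distance to the nearest target point), the decision rule (share of source points with a small conformal p-value), and the names `conformal_p_values`, `should_transfer`, `epsilon`, and `tolerance` are all assumptions chosen for illustration. It also ignores labels and computes the target scores once rather than recomputing them for each source point.

```python
import numpy as np


def nn_distance(points, reference, exclude_self=False):
    """Distance from each row of `points` to its nearest row in `reference`."""
    dists = np.linalg.norm(points[:, None, :] - reference[None, :, :], axis=2)
    if exclude_self:
        # points and reference are the same array: ignore zero self-distances
        np.fill_diagonal(dists, np.inf)
    return dists.min(axis=1)


def conformal_p_values(target, source):
    """Conformal p-value of every source point relative to the target sample.

    Assumed nonconformity score: nearest-neighbour distance to the target data
    (leave-one-out for the target points themselves).
    """
    target_scores = nn_distance(target, target, exclude_self=True)
    source_scores = nn_distance(source, target)
    # Count target points that are at least as "strange" as each source point.
    at_least = (target_scores[None, :] >= source_scores[:, None]).sum(axis=1)
    return (at_least + 1) / (len(target) + 1)


def should_transfer(target, source, epsilon=0.05, tolerance=0.1):
    """Heuristic decision rule (an assumption, not the paper's test).

    Under exchangeability a conformal p-value is at most `epsilon` with
    probability at most `epsilon`, so a much larger share of small p-values
    among the source points is evidence against transferring.
    """
    p_values = conformal_p_values(target, source)
    rejected_fraction = float(np.mean(p_values <= epsilon))
    return bool(rejected_fraction <= epsilon + tolerance), rejected_fraction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(200, 2))
    similar_source = rng.normal(size=(100, 2))
    shifted_source = rng.normal(loc=5.0, size=(100, 2))
    print(should_transfer(target, similar_source))
    print(should_transfer(target, shifted_source))
```

In this sketch a source sample drawn from the target distribution yields roughly uniform p-values and is accepted, while a strongly shifted source sample yields p-values near the minimum and is rejected. A faithful implementation would use a label-aware nonconformity measure and a proper rank-based test on the p-values.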
