Domain Adaptation Problems: A DASVM Classification Technique and a Circular Validation Strategy

This paper addresses pattern classification in the framework of domain adaptation, considering problems in which training data are available only for a source domain that differs from (although it is related to) the target domain of the unlabeled test data. Two main novel contributions are proposed: 1) a domain adaptation support vector machine (DASVM) technique, which extends the formulation of support vector machines (SVMs) to the domain adaptation framework, and 2) a circular indirect accuracy assessment strategy for validating the learning of domain adaptation classifiers when no true labels for the target-domain instances are available. Experimental results, obtained on a series of two-dimensional toy problems and on two real data sets related to brain-computer interface and remote sensing applications, confirmed the effectiveness and reliability of both the DASVM technique and the proposed circular validation strategy.
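The abstract names the circular validation strategy without giving its steps. The sketch below shows one plausible reading of the idea, under stated assumptions: a classifier learned on the source domain labels the target samples, a second classifier is then learned on the target samples with those estimated labels, and that second classifier is scored against the true source labels, which are the only ground truth available. A nearest-centroid rule stands in for the DASVM, which is not implemented here. The function names, the toy data generator, and the domain shift are all hypothetical and are not taken from the paper.

```python
import random


def fit_centroids(points, labels):
    """Stand-in classifier (NOT an SVM or DASVM): one mean vector per class."""
    sums, counts = {}, {}
    for (x, y), lab in zip(points, labels):
        sx, sy = sums.get(lab, (0.0, 0.0))
        sums[lab] = (sx + x, sy + y)
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab])
            for lab, (sx, sy) in sums.items()}


def predict(centroids, points):
    """Assign each 2-D point to the class with the closest centroid."""
    def nearest(p):
        return min(
            centroids,
            key=lambda lab: (p[0] - centroids[lab][0]) ** 2
                            + (p[1] - centroids[lab][1]) ** 2,
        )
    return [nearest(p) for p in points]


def circular_accuracy(source_x, source_y, target_x):
    """Forward pass: source -> target. Backward pass: target -> source.

    Returns the accuracy of the backward classifier on the true source
    labels, used as an indirect check on the target-domain solution.
    """
    forward = fit_centroids(source_x, source_y)
    target_estimated = predict(forward, target_x)        # no true target labels used
    backward = fit_centroids(target_x, target_estimated)
    source_predicted = predict(backward, source_x)
    hits = sum(p == t for p, t in zip(source_predicted, source_y))
    return hits / len(source_y)


def make_blobs(shift, n=100, seed=0):
    """Hypothetical two-class 2-D toy data; `shift` moves both classes along x."""
    rng = random.Random(seed)
    xs, ys = [], []
    for lab, cx in ((0, -2.0), (1, 2.0)):
        for _ in range(n):
            xs.append((cx + shift + rng.gauss(0, 0.5), rng.gauss(0, 0.5)))
            ys.append(lab)
    return xs, ys


source_x, source_y = make_blobs(shift=0.0, seed=0)
target_x, _ = make_blobs(shift=0.5, seed=1)   # target labels are discarded
acc = circular_accuracy(source_x, source_y, target_x)
print(f"backward accuracy on source labels: {acc:.3f}")
```

A high backward accuracy is consistent with a good target-domain solution but does not prove it. In this sketch the score only flags cases where the estimated target labels are too poor to recover the source classes.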
