A Literature Survey on Domain Adaptation of Statistical Classifiers

The domain adaptation problem, especially domain adaptation in natural language processing, started gaining much attention very recently [Daumé III and Marcu, 2006, Blitzer et al., 2006, Ben-David et al., 2007, Daumé III, 2007, Satpal and Sarawagi, 2007]. However, some special kinds of domain adaptation problems have been studied before under different names such as class imbalance [Japkowicz and Stephen, 2002], covariate shift [Shimodaira, 2000], and sample selection bias [Heckman, 1979]. There are also some well-studied machine learning problems that are closely related but not equivalent to domain adaptation, including multi-task learning [Caruana, 1997] and semi-supervised learning [Chapelle et al., 2006]. In this literature survey, we review existing work in both the machine learning and the natural language processing communities related to domain adaptation. Because this relatively new topic is constantly attracting attention, our survey is necessarily incomplete. Nevertheless, we try to cover the major lines of work that we are aware of up to the date this survey is written. This survey will also be constantly updated. The goal of this literature survey is twofold. First, existing studies on domain adaptation seem very different from each other, and different terms are used to refer to the problem. There has not been any survey that connects these different studies. This survey thus tries to organize the existing work in a systematic way and draw a big picture of the domain adaptation problem with its possible solutions. Second, a systematic literature survey shows the limitations of current work and points out promising directions that should be explored.

[1]  J. Heckman Sample selection bias as a specification error , 1979 .

[2]  Stan Matwin,et al.  Addressing the Curse of Imbalanced Training Sets: One-Sided Selection , 1997, ICML.

[3]  Rich Caruana,et al.  Multitask Learning , 1997, Machine-mediated learning.

[4]  H. Shimodaira,et al.  Improving predictive inference under covariate shift by weighting the log-likelihood function , 2000 .

[5]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[6]  JapkowiczNathalie,et al.  The class imbalance problem: A systematic study , 2002 .

[7]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[8]  Nathalie Japkowicz,et al.  The class imbalance problem: A systematic study , 2002, Intell. Data Anal..

[9]  T. Ben-David,et al.  Exploiting Task Relatedness for Multiple , 2003 .

[10]  Charles A. Micchelli,et al.  Kernels for Multi--task Learning , 2004, NIPS.

[11]  Bianca Zadrozny,et al.  Learning and evaluating classifiers under sample selection bias , 2004, ICML.

[12]  Alex Acero,et al.  Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lo , 2006, Comput. Speech Lang..

[13]  Yi Lin,et al.  Support Vector Machines for Classification in Nonstandard Situations , 2002, Machine Learning.

[14]  Massimiliano Pontil,et al.  Regularized multi--task learning , 2004, KDD.

[15]  Sebastian Thrun,et al.  Text Classification from Labeled and Unlabeled Documents using EM , 2000, Machine Learning.

[16]  Masashi Sugiyama,et al.  Input-dependent estimation of generalization error under covariate shift , 2005 .

[17]  Hwee Tou Ng,et al.  Word Sense Disambiguation with Distribution Estimation , 2005, IJCAI.

[18]  Tong Zhang,et al.  A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..

[19]  Xiaojin Zhu,et al.  Semi-Supervised Learning Literature Survey , 2005 .

[20]  Daniel Marcu,et al.  Domain Adaptation for Statistical Classifiers , 2006, J. Artif. Intell. Res..

[21]  Masashi Sugiyama,et al.  Mixture Regression for Covariate Shift , 2006, NIPS.

[22]  Hwee Tou Ng,et al.  Estimating Class Priors in Domain Adaptation for Word Sense Disambiguation , 2006, ACL.

[23]  Koby Crammer,et al.  Analysis of Representations for Domain Adaptation , 2006, NIPS.

[24]  John Blitzer,et al.  Domain Adaptation with Structural Correspondence Learning , 2006, EMNLP.

[25]  Bernhard Schölkopf,et al.  Correcting Sample Selection Bias by Unlabeled Data , 2006, NIPS.

[26]  Steffen Bickel,et al.  Dirichlet-Enhanced Spam Filtering based on Biased Samples , 2006, NIPS.

[27]  Yong Yu,et al.  Bridged Refinement for Transfer Learning , 2007, PKDD.

[28]  Sunita Sarawagi,et al.  Domain Adaptation of Conditional Probability Models Via Feature Subsetting , 2007, PKDD.

[29]  Xiao Li,et al.  A Bayesian Divergence Prior for Classiffier Adaptation , 2007, AISTATS.

[30]  ChengXiang Zhai,et al.  A two-stage approach to domain adaptation for statistical classifiers , 2007, CIKM '07.

[31]  Qiang Yang,et al.  Boosting for transfer learning , 2007, ICML '07.

[32]  Lawrence Carin,et al.  Multi-Task Learning for Classification with Dirichlet Process Priors , 2007, J. Mach. Learn. Res..

[33]  Hal Daumé,et al.  Frustratingly Easy Domain Adaptation , 2007, ACL.

[34]  ChengXiang Zhai,et al.  Instance Weighting for Domain Adaptation in NLP , 2007, ACL.

[35]  Qiang Yang,et al.  Transferring Naive Bayes Classifiers for Text Classification , 2007, AAAI.

[36]  John Blitzer,et al.  Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification , 2007, ACL.

[37]  Koby Crammer,et al.  Learning Bounds for Domain Adaptation , 2007, NIPS.

[38]  Steffen Bickel,et al.  Discriminative learning for differing training and test distributions , 2007, ICML '07.

[39]  Jingbo Zhu,et al.  Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem , 2007, EMNLP.