Target Contrastive Pessimistic Discriminant Analysis

Domain-adaptive classifiers learn from a source domain and aim to generalize to a target domain. If the classifier's assumptions about the relationship between the domains (e.g., covariate shift) hold, it will usually outperform a non-adaptive source classifier; when those assumptions are invalid, however, it can perform substantially worse. Validating the assumptions requires labeled target samples, which are usually not available. We argue that, in order to make domain-adaptive classifiers more practical, it is necessary to focus on robust methods: robust in the sense that the classifier still guarantees a certain level of performance without strong assumptions about the relationship between the domains. With this objective in mind, we formulate a conservative parameter estimator that deviates from the source classifier only when a lower or equal risk is guaranteed for all possible labellings of the given target samples. We derive the corresponding estimator for a discriminant analysis model and show that its risk is, in fact, strictly smaller than that of the source classifier. Experiments indicate that our classifier outperforms state-of-the-art classifiers on problems with geographically biased samples.
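
For concreteness, the conservative estimator described above can be written as a minimax problem over soft labellings of the unlabeled target samples. The following is a sketch in notation introduced here (none of these symbols are defined in this excerpt):

\hat{\theta} = \arg\min_{\theta} \; \max_{q \in \Delta_{K-1}^{m}} \; \Big[ \hat{R}(\theta \mid X_{\mathcal{T}}, q) \;-\; \hat{R}(\hat{\theta}_{\mathcal{S}} \mid X_{\mathcal{T}}, q) \Big],

where X_{\mathcal{T}} collects the m target samples, each row of q lies on the (K-1)-dimensional probability simplex over the K classes, \hat{R} denotes empirical risk, and \hat{\theta}_{\mathcal{S}} is the source classifier's parameter estimate. Because the contrast is zero at \theta = \hat{\theta}_{\mathcal{S}} for every labelling q, the minimax value is at most zero; the estimator therefore deviates from the source classifier only when even the worst-case labelling yields a lower or equal risk, which is exactly the guarantee claimed above.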
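
A minimal numerical sketch of this minimax problem follows: gradient descent on the parameters alternated with projected gradient ascent on the soft labelling. For illustration only, a linear softmax model with log-loss stands in for the paper's discriminant analysis model, and all function names here are our own invention; this is a sketch under those assumptions, not the authors' algorithm.

```python
import numpy as np

def project_simplex(Q):
    """Row-wise Euclidean projection onto the probability simplex
    (standard sort-based algorithm)."""
    n, k = Q.shape
    U = np.sort(Q, axis=1)[:, ::-1]
    css = np.cumsum(U, axis=1) - 1.0
    idx = np.arange(1, k + 1)
    rho = ((U - css / idx) > 0).sum(axis=1)
    tau = css[np.arange(n), rho - 1] / rho
    return np.maximum(Q - tau[:, None], 0.0)

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def log_loss(Theta, X):
    """L[i, k] = -log p(k | x_i; Theta) for a linear softmax model
    (used here as a stand-in for discriminant analysis)."""
    return -np.log(np.clip(softmax(X @ Theta), 1e-12, None))

def tcp_estimate(X_t, Theta_S, steps=1000, lr=0.1):
    """Sketch of the contrastive minimax estimator: descend in Theta,
    ascend in the soft labelling q, project q back onto the simplex."""
    m, _ = X_t.shape
    K = Theta_S.shape[1]
    Theta = Theta_S.copy()
    q = np.full((m, K), 1.0 / K)       # start from a uniform labelling
    for t in range(steps):
        a = lr / np.sqrt(t + 1)        # decaying step size
        # Descent in Theta: gradient of (1/m) sum_i sum_k q_ik * L_ik(Theta).
        Theta = Theta - a * X_t.T @ (softmax(X_t @ Theta) - q) / m
        # Ascent in q: the contrast is linear in q, with slope equal to the
        # per-sample loss difference between Theta and the source estimate.
        slope = (log_loss(Theta, X_t) - log_loss(Theta_S, X_t)) / m
        q = project_simplex(q + a * slope)
    return Theta
```

Initializing the parameters at the source estimate and the labelling at uniform means the procedure only moves away from the source classifier when the adversarially chosen labelling still favors doing so.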
