Zero-Shot Domain Adaptation: A Multi-View Approach (TTI-TR-2009-1)

Domain adaptation algorithms address settings in which the training (source) data distribution and the test (target) data distribution differ, potentially substantially. For example, in a natural language processing task, the target genre may contain many phrases that are essential for low target error yet never occur in the source training set, or even lack support under the source domain's distribution. This work provides a domain adaptation algorithm that provably permits zero-shot learning: learning an accurate classifier for the target domain using labeled data only from the source domains, with no labeled data from the target domain. We also give finite-sample error bounds showing how such zero-shot learning is possible even in the NLP example above. The key intuition we formalize is that these novel target-specific features can be exploited through their correlation with the features present in both the source and target domains, and that this correlation can be learned from unlabeled data alone. Our experiments demonstrate the robust success of our algorithm on a variety of domain adaptation tasks for product review rating prediction across multiple product types.
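To make the multi-view intuition concrete, the Python sketch below illustrates one way such a pipeline could look: canonical correlation analysis (CCA) is fit on unlabeled data to tie target-specific features to the shared features, a classifier is then trained on labeled source data represented in the shared CCA subspace, and target predictions use both views without any target labels. The synthetic data, the use of scikit-learn's CCA, and the score-averaging step are illustrative assumptions of ours, not the report's exact algorithm.

```python
# A minimal sketch of the multi-view idea on synthetic data, assuming a
# CCA-based instantiation (scikit-learn's CCA); all names, dimensions, and
# the score-averaging heuristic below are illustrative, not the report's
# exact algorithm.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d_shared, d_target, k = 50, 30, 5   # shared dims, target-only dims, latent dim
A = rng.normal(size=(k, d_shared))
B = rng.normal(size=(k, d_target))

def make_views(n):
    """Both views are noisy linear images of a shared latent signal U."""
    U = rng.normal(size=(n, k))
    X_shared = U @ A + 0.1 * rng.normal(size=(n, d_shared))
    X_tgt = U @ B + 0.1 * rng.normal(size=(n, d_target))
    y = (U[:, 0] > 0).astype(int)   # toy labels driven by the latent signal
    return X_shared, X_tgt, y

# Step 1: unlabeled target data reveals how the target-only features
# correlate with the shared features; CCA learns that correlation structure.
X_unlab_shared, X_unlab_tgt, _ = make_views(2000)
cca = CCA(n_components=k).fit(X_unlab_shared, X_unlab_tgt)

# Step 2: train on labeled *source* data, which carries only the shared
# view, represented in the learned CCA subspace.
X_src_shared, _, y_src = make_views(500)
clf = LogisticRegression().fit(cca.transform(X_src_shared), y_src)

# Step 3: zero-shot prediction on the target domain. Target examples have
# both views; averaging the two canonical projections is one simple way to
# let the target-only features contribute, with no target labels anywhere.
X_tst_shared, X_tst_tgt, y_tst = make_views(200)
Zx, Zy = cca.transform(X_tst_shared, X_tst_tgt)
print("zero-shot target accuracy:", clf.score((Zx + Zy) / 2.0, y_tst))
```

The design point the sketch is meant to surface: every label used above came from the source side, while the target-only features entered the model purely through the cross-view correlation estimated from unlabeled data.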
