Zero-Shot Domain Adaptation: A Multi-View Approach (TTI-TR-2009-1)

Domain adaptation algorithms address settings in which the training (source) data distribution and the test (target) data distribution differ, potentially substantially. For example, in a natural language processing task, the target genre may contain many phrases that are essential for low target error yet never occur in the source training set, or even lack support under the source domain's distribution. This work provides a domain adaptation algorithm that provably permits zero-shot learning: learning an accurate classifier for the target domain using labeled data only from the source domains, with no labeled data from the target domain. We also give finite-sample error bounds showing how such zero-shot learning is possible even in the NLP example above. The key intuition we formalize is that these novel target-specific features can be exploited through their correlation with the features present in both the source and target domains, and that this correlation can be learned from unlabeled data alone. Our experiments demonstrate the robust success of our algorithm on a variety of domain adaptation tasks for product review rating prediction across multiple product types.
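To make the multi-view intuition concrete, the Python sketch below illustrates one way such a pipeline could look: canonical correlation analysis (CCA) is fit on unlabeled data to tie target-specific features to the shared features, a classifier is then trained on labeled source data represented in the shared CCA subspace, and target predictions use both views without any target labels. The synthetic data, the use of scikit-learn's CCA, and the score-averaging step are illustrative assumptions of ours, not the report's exact algorithm.

```python
# A minimal sketch of the multi-view idea on synthetic data, assuming a
# CCA-based instantiation (scikit-learn's CCA); all names, dimensions, and
# the score-averaging heuristic below are illustrative, not the report's
# exact algorithm.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d_shared, d_target, k = 50, 30, 5   # shared dims, target-only dims, latent dim
A = rng.normal(size=(k, d_shared))
B = rng.normal(size=(k, d_target))

def make_views(n):
    """Both views are noisy linear images of a shared latent signal U."""
    U = rng.normal(size=(n, k))
    X_shared = U @ A + 0.1 * rng.normal(size=(n, d_shared))
    X_tgt = U @ B + 0.1 * rng.normal(size=(n, d_target))
    y = (U[:, 0] > 0).astype(int)   # toy labels driven by the latent signal
    return X_shared, X_tgt, y

# Step 1: unlabeled target data reveals how the target-only features
# correlate with the shared features; CCA learns that correlation structure.
X_unlab_shared, X_unlab_tgt, _ = make_views(2000)
cca = CCA(n_components=k).fit(X_unlab_shared, X_unlab_tgt)

# Step 2: train on labeled *source* data, which carries only the shared
# view, represented in the learned CCA subspace.
X_src_shared, _, y_src = make_views(500)
clf = LogisticRegression().fit(cca.transform(X_src_shared), y_src)

# Step 3: zero-shot prediction on the target domain. Target examples have
# both views; averaging the two canonical projections is one simple way to
# let the target-only features contribute, with no target labels anywhere.
X_tst_shared, X_tst_tgt, y_tst = make_views(200)
Zx, Zy = cca.transform(X_tst_shared, X_tst_tgt)
print("zero-shot target accuracy:", clf.score((Zx + Zy) / 2.0, y_tst))
```

The design point the sketch is meant to surface: every label used above came from the source side, while the target-only features entered the model purely through the cross-view correlation estimated from unlabeled data.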
