Cross-Domain Sentiment Classification with Contrastive Learning and Mutual Information Maximization

Contrastive learning (CL) has proven to be a powerful representation learning method. In this work we propose CLIM: Contrastive Learning with mutual Information Maximization, to explore the potential of CL for cross-domain sentiment classification. To the best of our knowledge, CLIM is the first to adopt contrastive learning for natural language processing (NLP) tasks across domains. Because labels are scarce in the target domain, we introduce mutual information maximization (MIM) alongside CL to exploit the features that best support the final prediction. Furthermore, MIM maintains a relatively balanced distribution over the model's predictions and enlarges the margin between classes on the target domain. The larger margin increases the model's robustness and enables the same classifier to remain optimal across domains. Consequently, we achieve new state-of-the-art results on the Amazon review dataset as well as the airlines dataset, demonstrating the efficacy of the proposed CLIM.
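
The abstract does not spell out the training objectives, so the following is a minimal sketch of the two loss terms it alludes to: an NT-Xent-style contrastive loss over two augmented views, and a mutual information maximization term on unlabeled target-domain predictions. This is an illustrative PyTorch sketch, not the authors' implementation; all names (nt_xent_loss, mim_loss, tau, the encoder producing z1/z2) are assumptions. In a full pipeline one would presumably combine these with a supervised cross-entropy loss on labeled source-domain data.

```python
# Illustrative sketch only: assumed PyTorch formulation of the contrastive
# and MIM objectives described in the abstract, not the authors' code.
import torch
import torch.nn.functional as F


def nt_xent_loss(z1, z2, tau=0.1):
    """Contrastive (NT-Xent / InfoNCE) loss between two augmented views.

    z1, z2: (batch, dim) representations of the same sentences under two
    augmentations; each row of z1 is positive with the matching row of z2
    and negative with every other example in the batch.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                       # (2B, dim)
    sim = z @ z.t() / tau                                # scaled cosine similarities
    sim.fill_diagonal_(float('-inf'))                    # exclude self-pairs
    batch = z1.size(0)
    targets = torch.cat([torch.arange(batch, 2 * batch),
                         torch.arange(0, batch)]).to(z.device)
    return F.cross_entropy(sim, targets)


def mim_loss(logits):
    """Mutual information maximization on unlabeled target-domain logits.

    Maximizing I(x; y_hat) = H(marginal prediction) - mean per-example entropy
    pushes individual predictions to be confident (enlarging class margins)
    while keeping the marginal class distribution balanced; the negative is
    returned so it can be minimized together with the other losses.
    """
    probs = F.softmax(logits, dim=1)                     # (B, num_classes)
    marginal = probs.mean(dim=0)
    marginal_entropy = -(marginal * torch.log(marginal + 1e-8)).sum()
    conditional_entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
    return conditional_entropy - marginal_entropy        # = -I(x; y_hat)
```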
