Domain Adapted Word Embeddings for Improved Sentiment Classification

Generic word embeddings are trained on large-scale generic corpora; Domain Specific (DS) word embeddings are trained only on data from a domain of interest. This paper proposes a method to combine the breadth of generic embeddings with the specificity of domain-specific embeddings. The resulting embeddings, called Domain Adapted (DA) word embeddings, are formed by aligning corresponding word vectors using Canonical Correlation Analysis (CCA) or the related nonlinear Kernel CCA. Evaluation results on sentiment classification tasks show that the DA embeddings substantially outperform both generic and DS embeddings when used as input features to standard or state-of-the-art sentence encoding algorithms for classification.
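The alignment step can be illustrated with a minimal linear CCA sketch in NumPy. This is not the authors' implementation: the function names, the regularization constant, the synthetic data, and in particular the equal-weight average used to merge the two projected views are all assumptions made for illustration. The abstract states only that corresponding word vectors are aligned with CCA or Kernel CCA.

```python
import numpy as np

def cca_projections(X, Y, k, reg=1e-8):
    """Linear CCA via whitening + SVD.

    X: (n_words, d_x) and Y: (n_words, d_y) hold two embeddings of the same
    words, row-aligned. Returns projection matrices A (d_x, k), B (d_y, k)
    and the top-k canonical correlations. `reg` is an assumed ridge term
    that keeps the covariance matrices invertible.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # Whitened cross-covariance T = Lx^{-1} Cxy Ly^{-T}
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the singular vectors back to the original coordinates
    A = np.linalg.solve(Lx.T, U[:, :k])
    B = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return A, B, s[:k]

def domain_adapted(X_generic, X_ds, k):
    """Project both views into the shared CCA space and average them.

    The 0.5/0.5 weighting is an assumption of this sketch.
    """
    A, B, corr = cca_projections(X_ds, X_generic, k)
    Z_ds = (X_ds - X_ds.mean(axis=0)) @ A
    Z_g = (X_generic - X_generic.mean(axis=0)) @ B
    return 0.5 * Z_ds + 0.5 * Z_g, corr

# Synthetic stand-ins for generic and DS vectors over a shared vocabulary.
rng = np.random.default_rng(0)
n_words, d_generic, d_ds, k = 200, 10, 6, 4
latent = rng.normal(size=(n_words, k))
X_generic = latent @ rng.normal(size=(k, d_generic)) + 0.1 * rng.normal(size=(n_words, d_generic))
X_ds = latent @ rng.normal(size=(k, d_ds)) + 0.1 * rng.normal(size=(n_words, d_ds))

X_da, corr = domain_adapted(X_generic, X_ds, k)
print(X_da.shape)  # (200, 4)
```

In practice the rows would come from the vocabulary shared by a generic embedding and a DS embedding, and the resulting rows of `X_da` would be fed to a sentence encoder or classifier in place of either original embedding.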
