Enriching Word Embeddings with a Regressor Instead of Labeled Corpora

We propose a novel method for enriching word embeddings without the need for a labeled corpus. Instead, we show that relying on a regressor, trained with a small lexicon to predict pseudo-labels, significantly improves performance over current techniques that rely on human-derived sentence-level labels for an entire corpus. Our approach enables enrichment for corpora that have no labels (such as Wikipedia). Exploring the utility of this general approach in both sentiment-focused and non-sentiment-focused tasks, we show that enriching both Twitter-based and Wikipedia-based embeddings provides notable improvements in performance for binary sentiment classification, SemEval tasks, the embedding analogy task, and document classification. Importantly, our approach performs notably better and generalizes more readily than other state-of-the-art approaches for enriching embeddings from both labeled and unlabeled corpora.
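As a rough illustration of the pseudo-labeling idea only, the sketch below fits a regressor on the few words that a small lexicon covers and then predicts scores for words outside it. Everything specific here is an assumption made for illustration: the linear model trained by gradient descent, the two-dimensional toy vectors, the lexicon scores, and the names `fit_linear_regressor` and `predict`. The abstract does not state which regressor is used or how the pseudo-labels are fed back into the embeddings, so that step is not shown.

```python
def fit_linear_regressor(vectors, targets, lr=0.1, epochs=500, l2=1e-3):
    """Fit a linear model by gradient descent on mean squared error
    with a small L2 penalty on the weights (an assumed, simple choice)."""
    dim = len(vectors[0])
    n = len(vectors)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(vectors, targets):
            err = sum(wi * xi for wi, xi in zip(weights, x)) + bias - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [wi - lr * (g / n + l2 * wi) for wi, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return the model's score for one embedding vector."""
    return sum(wi * xi for wi, xi in zip(weights, x)) + bias


# Hypothetical toy embeddings; real ones would come from a pretrained model.
embeddings = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [-0.9, 0.1],
    "awful": [-0.8, 0.3],
    "fine": [0.5, 0.0],
    "poor": [-0.6, 0.2],
}

# Hypothetical small lexicon: word -> sentiment score in [-1, 1].
lexicon = {"good": 1.0, "great": 1.0, "bad": -1.0, "awful": -1.0}

# Train only on words that appear in both the lexicon and the vocabulary.
seed_words = [word for word in lexicon if word in embeddings]
weights, bias = fit_linear_regressor(
    [embeddings[word] for word in seed_words],
    [lexicon[word] for word in seed_words],
)

# Predict pseudo-labels for every word the lexicon does not cover.
pseudo_labels = {
    word: predict(weights, bias, vec)
    for word, vec in embeddings.items()
    if word not in lexicon
}
```

In this toy setup, `pseudo_labels` holds scores for "fine" and "poor", which no human labeled; in principle the same step applies to any unlabeled vocabulary, such as one drawn from Wikipedia.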
