Augmenting Semantic Lexicons Using Word Embeddings and Transfer Learning

Sentiment-aware intelligent systems are essential to a wide array of applications. These systems are driven by language models that broadly fall into two paradigms: lexicon-based and contextual. Although recent contextual models are increasingly dominant, we still see demand for lexicon-based models because of their interpretability and ease of use. For example, lexicon-based models allow researchers to readily determine which words and phrases contribute most to a change in measured sentiment. A challenge for any lexicon-based approach is that the lexicon needs to be routinely expanded with new words and expressions. Here, we propose two models for automatic lexicon expansion. Our first model establishes a non-contextual baseline: a simple, shallow neural network initialized with pre-trained word embeddings. Our second model improves upon this baseline with a deep Transformer-based network that uses word definitions to estimate their lexical polarity. Our evaluation shows that both models are able to score new words with accuracy similar to that of reviewers from Amazon Mechanical Turk, but at a fraction of the cost.
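To make the interpretability claim concrete, the following is a minimal sketch of lexicon-based scoring, not the authors' implementation. It assumes a toy lexicon with made-up scores on a 1 to 9 scale, whitespace tokenization, and a neutral reference value of 5.0; the names `TOY_LEXICON`, `average_sentiment`, `word_contributions`, and `out_of_lexicon` are hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> sentiment score (made-up values, 1-9 scale).
TOY_LEXICON = {"love": 8.0, "happy": 8.0, "rain": 5.0, "hate": 2.0, "sad": 2.0}


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()


def average_sentiment(text, lexicon):
    """Frequency-weighted mean score over the words found in the lexicon."""
    counts = Counter(w for w in tokenize(text) if w in lexicon)
    total = sum(counts.values())
    if total == 0:
        return None  # no scored words, so no sentiment estimate
    return sum(lexicon[w] * n for w, n in counts.items()) / total


def word_contributions(text, lexicon, reference=5.0):
    """Per-word share of the difference between the text average and `reference`."""
    counts = Counter(w for w in tokenize(text) if w in lexicon)
    total = sum(counts.values())
    return {w: (lexicon[w] - reference) * n / total for w, n in counts.items()}


def out_of_lexicon(text, lexicon):
    """Words the lexicon cannot score; these are candidates for expansion."""
    return sorted({w for w in tokenize(text) if w not in lexicon})


if __name__ == "__main__":
    sample = "love love rain hate"
    print(average_sentiment(sample, TOY_LEXICON))
    print(word_contributions(sample, TOY_LEXICON))
    print(out_of_lexicon("love this doomscrolling", TOY_LEXICON))
```

The per-word contributions sum to the gap between the text's average and the reference value, which is what lets a researcher attribute a shift in measured sentiment to specific words. Words returned by `out_of_lexicon` are silently ignored by the average, which illustrates why the lexicon needs routine expansion.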
