Learning Domain-specific Sentiment Lexicon with Supervised Sentiment-aware LDA

Analyzing and understanding people's sentiments towards different topics has become an interesting task due to the explosion of opinion-rich resources. In most sentiment analysis applications, sentiment lexicons play a crucial role, serving as a source of sentiment polarity information. However, most previous work focuses on discovering general-purpose sentiment lexicons, which cannot capture domain-specific sentiment words or implicit, connotative sentiment words that appear objective. In this paper, we propose a supervised sentiment-aware LDA model (ssLDA). The model uses a minimal set of domain-independent seed words together with document labels to discover a domain-specific lexicon that is much richer and better adapted to the sentiment of specific documents. Experiments on two publicly available datasets (movie reviews and the Obama-McCain debate dataset) show that our model is effective in constructing a comprehensive, high-quality domain-specific sentiment lexicon. Furthermore, the resulting lexicon significantly improves the performance of sentiment classification tasks.
