Chapter 20 Generating, Refining and Using Sentiment Lexicons

In this chapter, which is based on [7–9], we report on work on the generation, refinement and use of sentiment lexicons that was carried out within the DuOMAn project. The project focused on developing language technology to support online media analysis. In media analysis, one of the key tasks is collecting detailed information about opinions and attitudes toward specific topics from various sources, both offline (traditional newspapers, archives) and online (news sites, blogs, forums). Specifically, media analysis concerns the following system task: given a topic and a list of documents discussing the topic, find all instances of attitudes toward the topic (e.g., positive/negative sentiments, or, if the topic is an organisation or person, support/criticism of this entity). For every such instance, one should identify the source of the sentiment, the polarity and, possibly, the subtopics that this attitude relates to (e.g., specific targets of criticism or support). Subsequently, a (human) media analyst must be able to aggregate the extracted information by source, polarity or subtopic, allowing them to build, for example, support/criticism networks [1]. Recent advances in language technology, especially in sentiment analysis, promise to (partially) automate this task. Sentiment analysis is often considered in the context of the following two tasks:

[1] Valentin Jijkoun, et al. Mining User Experiences from Online Forums: An Exploration, 2010, HLT-NAACL 2010.

[2] Valentin Jijkoun, et al. Generating Focused Topic-Specific Sentiment Lexicons, 2010, ACL.

[3] M. de Rijke, et al. Credibility Improves Topical Blog Post Retrieval, 2008, ACL.

[4] Gilad Mishne, et al. Why Are They Excited? Identifying and Explaining Spikes in Blog Mood Levels, 2006, EACL.

[5] J. Wiebe, et al. Recognizing Contextual Polarity: An Exploration of Features for Phrase-Level Sentiment Analysis, 2009.

[6] Jungi Kim, et al. KLE at TREC 2008 Blog Track: Blog Post and Feed Retrieval, 2008, TREC.

[7] Sung-Hyon Myaeng, et al. Domain-specific sentiment analysis using contextual feature generation, 2009, TSA@CIKM.

[8] Craig MacDonald, et al. Overview of the TREC 2006 Blog Track, 2006, TREC.

[9] Valentin Jijkoun, et al. Bootstrapping subjectivity detection, 2011, SIGIR '11.

[10] Steven Skiena, et al. Large-Scale Sentiment Analysis for News and Blogs (system demonstration), 2007, ICWSM.

[11] Jong-Hyeok Lee, et al. Improving Opinion Retrieval Based on Query-Specific Sentiment Lexicon, 2009, ECIR.

[12] M. de Rijke, et al. Parsimonious relevance models, 2008, SIGIR '08.

[13] Angela Fahrni, et al. Old Wine or Warm Beer: Target-Specific Sentiment Analysis of Adjectives.

[14] Hsin-Hsi Chen, et al. Overview of Opinion Analysis Pilot Task at NTCIR-6, 2007, NTCIR.

[15] Oren Etzioni, et al. Extracting Product Features and Opinions from Reviews, 2005, HLT.

[16] Soo-Min Kim, et al. Determining the Sentiment of Opinions, 2004, COLING.

[17] W. Bruce Croft, et al. A Markov random field model for term dependencies, 2005, SIGIR '05.

[18] Bing Liu, et al. Opinion observer: analyzing and comparing opinions on the Web, 2005, WWW '05.

[19] Hiroshi Kanayama, et al. Fully Automatic Lexicon Expansion for Domain-oriented Sentiment Analysis, 2006, EMNLP.

[20] Hsin-Hsi Chen, et al. Overview of Multilingual Opinion Analysis Task at NTCIR-7, 2008, NTCIR.

[21] Maarten de Rijke, et al. A Generative Blog Post Retrieval Model that Uses Query Expansion based on External Collections, 2009, ACL/IJCNLP.

[22] Claire Cardie, et al. Annotating Expressions of Opinions and Emotions in Language, 2005, Lang. Resour. Evaluation.

[23] Sung-Hyon Myaeng, et al. Generating Domain-Specific Clues Using News Corpus for Sentiment Classification, 2010, ICWSM.

[24] Iadh Ounis, et al. The TREC Blogs06 Collection: Creating and Analysing a Blog Test Collection, 2006.

[25] Xu Ling, et al. Topic sentiment mixture: modeling facets and opinions in weblogs, 2007, WWW '07.

[26] David L. Altheide. Qualitative Media Analysis, 1996.

[27] Bill Buxton, et al. Sketching User Experiences: Getting the Design Right and the Right Design, 2007.

[28] Ellen Riloff, et al. Learning Extraction Patterns for Subjective Expressions, 2003, EMNLP.

[29] Janyce Wiebe, et al. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis, 2005, HLT.