Leveraging sentiment analysis for topic detection

The emergence of new social media such as blogs, message boards, news, and web content in general has dramatically changed the ecosystems of corporations. Consumers, non-profit organizations, and other communities are extremely vocal on the web about their opinions and perceptions of companies and their brands. The ability to leverage this “voice of the web” to gain consumer, brand, and market insights can be truly differentiating and valuable to today's corporations. In particular, one important form of insight can be derived from sentiment analysis of web content. Sentiment analysis traditionally emphasizes the classification of web comments into positive, neutral, and negative categories. This paper goes beyond sentiment classification by focusing on techniques that detect the topics that are highly correlated with positive and negative opinions. Such techniques, when coupled with sentiment classification, can help business analysts understand both the overall sentiment and the drivers behind it. In this paper, we describe our overall sentiment analysis system, which comprises a bootstrapping method for weighting word polarities, automatic filtering and expansion of domain words, and a sentiment classification method. We then detail a novel topic detection method that uses point-wise mutual information and term frequency distribution. We demonstrate the effectiveness of our overall approach via several case studies on different social media data sets.
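The abstract does not give the scoring formula, so the following is only a minimal sketch of how point-wise mutual information could rank terms by their association with a sentiment class, assuming the standard definition PMI(t, c) = log2(P(t, c) / (P(t) P(c))) estimated from document counts. The function name `pmi_by_sentiment`, the `min_count` parameter, and the toy comments are hypothetical and are not taken from the paper; the `min_count` threshold is a simple stand-in and not the term frequency distribution component the paper refers to.

```python
import math
from collections import Counter


def pmi_by_sentiment(docs, min_count=2):
    """Rank terms by PMI with each sentiment label.

    docs: iterable of (tokens, label) pairs, where label is the output of
    some sentiment classifier (hypothetical input format).
    Returns {label: [(term, pmi), ...]} sorted by descending PMI.
    """
    n_docs = 0
    label_counts = Counter()   # documents per label
    term_counts = Counter()    # documents containing each term
    joint_counts = Counter()   # documents with a given (term, label) pair

    for tokens, label in docs:
        n_docs += 1
        label_counts[label] += 1
        for term in set(tokens):  # count each term once per document
            term_counts[term] += 1
            joint_counts[(term, label)] += 1

    scores = {label: [] for label in label_counts}
    for (term, label), joint in joint_counts.items():
        if term_counts[term] < min_count:
            continue  # assumption: skip rare terms, whose PMI is unreliable
        # PMI = log2( P(t, c) / (P(t) * P(c)) ), probabilities from doc counts
        pmi = math.log2((joint * n_docs) / (term_counts[term] * label_counts[label]))
        scores[label].append((term, pmi))

    for label in scores:
        # highest PMI first; ties broken alphabetically for stable output
        scores[label].sort(key=lambda pair: (-pair[1], pair[0]))
    return scores


if __name__ == "__main__":
    # Toy comments, already tokenized and labeled (illustrative data only).
    toy_docs = [
        (["battery", "dies", "fast"], "negative"),
        (["battery", "drains", "overnight"], "negative"),
        (["screen", "looks", "great"], "positive"),
        (["screen", "bright", "sharp"], "positive"),
    ]
    for label, ranked in pmi_by_sentiment(toy_docs).items():
        print(label, ranked)
```

In this sketch, a term that appears only in negative comments receives a positive PMI with the negative class, which is the sense in which a topic can be read as a driver of that sentiment. Counting documents rather than raw occurrences is a design choice made here to keep one long comment from dominating the score.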
