Featuring, Detecting, and Visualizing Human Sentiment in Chinese Micro-Blog

Micro-blogs are increasingly used by the public to express opinions and by organizations to detect public sentiment about social events or public policies. In this article, we examine and identify the key problems of this field, focusing particularly on three characteristics of Chinese “Weibo”: innovative words, multi-media elements, and hierarchical structure. Based on this analysis, we propose a novel approach and develop associated theoretical and technological methods to address these problems. These include a new sentiment word mining method based on three wording metrics and point-wise information, a rule set model for analyzing the sentiment features of different linguistic components, and a corresponding methodology for calculating sentiment at multiple granularities, with emoticon elements treated as auxiliary affective factors. We evaluate our new word discovery and sentiment detection methods on a real-life Chinese micro-blog dataset. Initial results show that the newly discovered words can improve sentiment detection and that our multi-level rule set method is more effective, with average accuracy 10.2% and 1.5% higher than that of two existing methods for Chinese micro-blog sentiment analysis. In addition, we exploit visualization techniques to study the relationships between online sentiment and real life. Visualizing the detected sentiment can help depict temporal patterns and spatial discrepancies.
