Sentiment analysis for various SNS media using Naïve Bayes classifier and its application to flaming detection

Social networking services (SNS) are among the most effective communication tools and have brought about drastic changes in our lives. Recently, however, a phenomenon called flaming, or backlash, has become a pressing problem for private companies. A flaming incident is usually triggered by thoughtless comments or actions on SNS, and it sometimes ends up seriously damaging the company's reputation. In this paper, in order to prevent such unexpected damage, we propose a new approach to sentiment analysis using a Naïve Bayes classifier, in which the features of tweets and comments are selected based on entropy-based criteria and an empirical rule for capturing negative expressions. In addition, we propose a semi-supervised learning approach to relabeling noisy training data, which come from various SNS media such as Twitter, Facebook, blogs and a Japanese textboard called '2-channel'. In the experiments, we use four data sets of users' comments that were posted to different SNS media of private companies. The experimental results show that the proposed Naïve Bayes classifier model performs well across the different SNS media, and that the semi-supervised learning works effectively for data consisting of long comments. Finally, the proposed method is applied to the detection of flaming incidents, and we show that such incidents are successfully detected.
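
As a rough illustration of the general idea described above, the following is a minimal sketch of a multinomial Naïve Bayes classifier whose vocabulary is filtered by an entropy criterion. It is not the paper's implementation: the abstract does not specify the exact entropy-based criteria, the negative-expression rule, or the relabeling procedure, so the threshold `max_entropy`, the function names, the Laplace smoothing and the toy English data are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def label_entropy(counts):
    """Shannon entropy (in bits) of a word's distribution over class labels."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def select_features(docs, labels, max_entropy=0.9):
    """Keep words whose label distribution is skewed (low entropy).

    The threshold is a hypothetical choice, not a value from the paper.
    """
    per_word = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        for w in set(tokens):  # count each word once per document
            per_word[w][y] += 1
    return {w for w, c in per_word.items() if label_entropy(c) <= max_entropy}


def train_nb(docs, labels, vocab):
    """Fit log-priors and Laplace-smoothed log-likelihoods per class."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        for w in tokens:
            if w in vocab:
                word_counts[y][w] += 1
    model = {}
    for y, n_docs in class_docs.items():
        total = sum(word_counts[y].values())
        log_prior = math.log(n_docs / len(labels))
        log_lik = {
            w: math.log((word_counts[y][w] + 1) / (total + len(vocab)))
            for w in vocab
        }
        model[y] = (log_prior, log_lik)
    return model


def predict(model, tokens):
    """Return the class with the highest posterior log-score."""
    def score(y):
        log_prior, log_lik = model[y]
        # Words outside the selected vocabulary are ignored.
        return log_prior + sum(log_lik[w] for w in tokens if w in log_lik)
    return max(model, key=score)


# Toy data, invented for illustration only.
docs = [
    ["great", "service", "the"],
    ["love", "the", "product"],
    ["terrible", "the", "support"],
    ["awful", "the", "product", "terrible"],
]
labels = ["pos", "pos", "neg", "neg"]

vocab = select_features(docs, labels)
model = train_nb(docs, labels, vocab)
print(sorted(vocab))
print(predict(model, ["awful", "support"]))
print(predict(model, ["love", "great"]))
```

In this toy setting, words that occur equally often in both classes (here "the" and "product") have an entropy of 1 bit and are dropped, so only class-indicative words contribute to the score. Real SNS text in Japanese would first need morphological analysis to obtain tokens, which this sketch leaves out.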
