A multi-granularity fuzzy computing model for sentiment classification of Chinese reviews

With the rapid growth of user-generated content online, unsupervised methods, which require no labeled training data, have become increasingly important in sentiment classification. However, their performance remains unsatisfactory, because existing unsupervised methods usually ignore sentence structure and the ambiguity of sentiment intensity. To address these problems, we propose a multi-granularity fuzzy computing model with two innovations. First, we introduce a multi-granularity method for computing the sentiment intensity of reviews. Specifically, we decompose each review into three levels of language units (words, phrases, and sentences) and compute the sentiment intensity of the review by combining rule-based and statistical methods. Second, we construct a fuzzy classifier to handle the ambiguity of sentiment intensity. Furthermore, we propose two self-supervised methods that use pseudo-labeled training data to learn the optimal parameters of the fuzzy classifier. Experimental results on four datasets show that our model improves accuracy by 6.25% on average over competitive baselines in sentiment classification of Chinese reviews.
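The following is a minimal sketch of the pipeline the abstract outlines, not the authors' implementation. It assumes pre-segmented input, a hand-made toy lexicon, simple negation and degree rules at the phrase level, plain averaging at the sentence and review levels, and a sigmoid membership function for the fuzzy classifier. All names (`WORD_INTENSITY`, `DEGREE`, `NEGATION`, `center`, `slope`) and all numeric values are hypothetical. The parameters `center` and `slope` stand in for what the paper learns from pseudo-labeled data; here they are fixed by hand.

```python
import math

# Hypothetical toy resources; the paper's actual lexicon and rules are not given here.
WORD_INTENSITY = {"好": 0.8, "喜欢": 0.7, "差": -0.8, "失望": -0.7}
DEGREE = {"很": 1.5, "非常": 2.0, "有点": 0.5}
NEGATION = {"不", "没"}


def phrase_scores(tokens):
    """Phrase level: fold preceding negation/degree modifiers into each sentiment word."""
    scores, factor = [], 1.0
    for tok in tokens:
        if tok in NEGATION:
            factor *= -1.0
        elif tok in DEGREE:
            factor *= DEGREE[tok]
        elif tok in WORD_INTENSITY:
            scores.append(factor * WORD_INTENSITY[tok])
            factor = 1.0
        else:
            factor = 1.0  # an unrelated word ends the modifier run
    return scores


def sentence_score(tokens):
    """Sentence level: average the phrase intensities (0.0 if none are found)."""
    scores = phrase_scores(tokens)
    return sum(scores) / len(scores) if scores else 0.0


def review_score(sentences):
    """Review level: average the sentence intensities."""
    scores = [sentence_score(s) for s in sentences]
    return sum(scores) / len(scores) if scores else 0.0


def memberships(intensity, center=0.0, slope=4.0):
    """Fuzzy step: degrees of membership in the 'positive' and 'negative' sets."""
    positive = 1.0 / (1.0 + math.exp(-slope * (intensity - center)))
    return positive, 1.0 - positive


def classify(sentences, center=0.0, slope=4.0):
    """Assign the class with the larger membership degree."""
    positive, negative = memberships(review_score(sentences), center, slope)
    return "positive" if positive >= negative else "negative"


# A review given as a list of already segmented sentences.
review = [["这个", "手机", "很", "好"], ["我", "不", "喜欢", "电池"]]
print(review_score(review), classify(review))
```

In this sketch the fuzzy memberships replace a hard threshold at zero, so a review with weak overall intensity receives membership degrees close to 0.5 in both classes rather than a confident label.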
