Hybrid approach framework for sentiment classification on microblogging

Microblogging is widely used to express opinions about entities, and knowing the sentiment polarity of these messages can benefit decision making, planning, visualization, and other tasks. However, outdated training data, together with the short and noisy nature of microblog posts, leads to low classification accuracy, and existing approaches require human effort to manually label large amounts of training data. To tackle these problems, we propose a framework that combines a lexicon-based approach with a machine learning approach. SentiWordNet is used to label the training data automatically, and a Support Vector Machine is then trained on it for sentiment classification. We study two scoring mechanisms for labeling the training data: one with Word Sense Disambiguation and one without (Non Word Sense Disambiguation). The framework also uses MapReduce to process large datasets. The results show that Non Word Sense Disambiguation scoring is the better choice for this framework. The framework is functional, more automated, and requires less human effort than manual labeling.
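
The automatic labeling step can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon below is a hypothetical stand-in for SentiWordNet, the function names are invented for illustration, and the scoring rule (averaging positive minus negative scores over all senses of a word, with no disambiguation) is only one plausible reading of "Non Word Sense Disambiguation" scoring.

```python
from statistics import mean

# Hypothetical toy lexicon standing in for SentiWordNet.
# Each word maps to a list of (positive_score, negative_score) pairs, one per sense.
TOY_LEXICON = {
    "good": [(0.75, 0.0), (0.5, 0.0)],
    "love": [(0.625, 0.0), (0.5, 0.125)],
    "bad": [(0.0, 0.625), (0.0, 0.75)],
    "slow": [(0.0, 0.25), (0.125, 0.375)],
}


def word_score(word):
    """Average (positive - negative) over all senses; no sense is selected (assumed non-WSD rule)."""
    senses = TOY_LEXICON.get(word)
    if not senses:
        return 0.0  # words missing from the lexicon contribute nothing
    return mean(pos - neg for pos, neg in senses)


def label_message(text):
    """Assign a polarity label from the summed word scores of a whitespace-tokenized message."""
    total = sum(word_score(token) for token in text.lower().split())
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def build_training_set(messages):
    """Keep only messages with a clear polarity, paired with their automatic label."""
    labeled = ((message, label_message(message)) for message in messages)
    return [(message, label) for message, label in labeled if label != "neutral"]


if __name__ == "__main__":
    sample = ["I love this good phone", "bad and slow service", "the phone arrived"]
    for message, label in build_training_set(sample):
        print(label, "<-", message)
```

In a pipeline like the one described, the resulting (message, label) pairs would serve as training data for a Support Vector Machine in place of manually labeled examples.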
