Text sentiment analysis has high practical value and is widely applied in spam filtering, recommendation systems, and automatic text summarization. This paper presents a text classification method for microblogs based on an extended emotional lexicon. Using the official Sina Weibo API, we collect the comments posted under hot topics. We combine an existing emotional lexicon with extended lexicons of microblog emoticons, cyberwords, and interjections, among others, and take into account negation rules, modification by degree words, and the effect of sentence patterns. We design a contrast experiment with six groups to determine how much each of these factors improves sentiment classification accuracy. In addition, this paper puts forward a detailed formula for computing emotional intensity. Experimental results from the microblog comments emotional polarity evaluator (MCEPE), developed in C++, show that classification accuracy can reach 80% or more.
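The lexicon-based scoring described above can be illustrated with a minimal sketch. It is not the MCEPE implementation (which is written in C++) and not the paper's intensity formula. The lexicon entries, the weights, the exclamation factor of 1.5, and the function names `intensity` and `classify` are all assumptions made for illustration, and the input is assumed to be already tokenized.

```python
from typing import Dict, List

# Toy lexicons, invented for illustration only (not the paper's resources).
POLARITY: Dict[str, float] = {
    "good": 1.0, "love": 2.0, "bad": -1.0, "awful": -2.0,
    "[smile]": 1.0, "[angry]": -1.5,  # emoticon entries
}
NEGATIONS = {"not", "never"}
DEGREE: Dict[str, float] = {"very": 1.5, "slightly": 0.5}
EXCLAMATION_FACTOR = 1.5  # assumed weight for an exclamatory sentence


def intensity(tokens: List[str]) -> float:
    """Sum lexicon scores; negation and degree words modify the next sentiment word."""
    total = 0.0
    sign = 1.0    # flipped by each pending negation word
    weight = 1.0  # scaled by each pending degree word
    for tok in tokens:
        if tok in NEGATIONS:
            sign = -sign
        elif tok in DEGREE:
            weight *= DEGREE[tok]
        elif tok in POLARITY:
            total += sign * weight * POLARITY[tok]
            sign, weight = 1.0, 1.0  # modifiers are consumed by this word
    # Simple sentence-pattern rule: a trailing "!" amplifies the score.
    if tokens and tokens[-1] == "!":
        total *= EXCLAMATION_FACTOR
    return total


def classify(tokens: List[str], threshold: float = 0.0) -> str:
    """Map the intensity score to a polarity label."""
    score = intensity(tokens)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

For example, `classify(["not", "very", "good"])` labels the comment as negative because the negation flips the amplified positive score. Disabling individual lexicons or rules in such a scorer is one way a group-by-group comparison of impact factors could be set up.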