Competitive Self-Training technique for sentiment analysis in mass social media

This paper aims to analyze users' emotions on Twitter automatically, using data without sentiment labels in addition to data with sentiment labels, and to increase the accuracy of sentiment analysis through an improved Self-Training, one of the semi-supervised learning techniques. Self-Training has a weak point: a classification mistake can reinforce itself. Because Self-Training iteratively modifies the model based on the model's own output, a wrong output leads to a wrongly modified model. To alleviate this weak point, we propose a competitive Self-Training technique. We create three models based on the output of the current model and choose the best one. The three models are created from binary mixture perspectives: the threshold, the same number, and the maximum number for updates. We repeat this step of creating models and choosing the one with the highest F-measure. As a result, we can improve the performance of the sentiment analysis model.
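The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define the three update criteria, so the readings used here are assumptions: "threshold" adds every pseudo-labeled example whose confidence reaches a cutoff, "same number" adds the same number of top-confidence examples per class, and "maximum number" adds at most a fixed number of top-confidence examples overall. The 1-D nearest-centroid classifier, the function names, the parameter values, and the rule of stopping when no candidate matches the current F-measure are also hypothetical choices made only to keep the example self-contained.

```python
from statistics import mean

def train(data):
    """Fit a toy 1-D nearest-centroid classifier: one mean per class (assumption)."""
    return {c: mean(x for x, y in data if y == c) for c in {y for _, y in data}}

def predict(model, x):
    """Return (label, confidence); confidence lies in [0.5, 1] for two classes."""
    d = {c: abs(x - m) for c, m in model.items()}
    label = min(d, key=d.get)
    total = sum(d.values())
    conf = 1.0 - d[label] / total if total else 0.5
    return label, conf

def f_measure(model, data, positive=1):
    """F1 score of the positive class on labeled validation data."""
    tp = fp = fn = 0
    for x, y in data:
        p, _ = predict(model, x)
        tp += (p == positive and y == positive)
        fp += (p == positive and y != positive)
        fn += (p != positive and y == positive)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Three assumed update criteria; each maps scored examples to pseudo-labeled pairs.
def select_threshold(scored, threshold=0.8):
    return [(x, y) for x, y, conf in scored if conf >= threshold]

def select_same_number(scored, k=2):
    picked = []
    for label in (0, 1):
        ranked = sorted((s for s in scored if s[1] == label), key=lambda s: -s[2])
        picked += [(x, y) for x, y, _ in ranked[:k]]
    return picked

def select_max_number(scored, m=3):
    ranked = sorted(scored, key=lambda s: -s[2])
    return [(x, y) for x, y, _ in ranked[:m]]

def competitive_self_training(labeled, unlabeled, validation, rounds=5):
    labeled, unlabeled = list(labeled), list(unlabeled)
    model = train(labeled)
    best_f = f_measure(model, validation)
    for _ in range(rounds):
        if not unlabeled:
            break
        # Pseudo-label the unlabeled pool with the current model.
        scored = [(x, *predict(model, x)) for x in unlabeled]
        candidates = []
        for select in (select_threshold, select_same_number, select_max_number):
            picked = select(scored)
            if not picked:
                continue
            cand_data = labeled + picked
            cand_model = train(cand_data)
            candidates.append((f_measure(cand_model, validation), cand_model, cand_data, picked))
        if not candidates:
            break
        # Competition: keep the candidate with the highest F-measure.
        f, cand_model, cand_data, picked = max(candidates, key=lambda c: c[0])
        if f < best_f:  # assumed stopping rule: never accept a worse model
            break
        model, labeled, best_f = cand_model, cand_data, f
        used = {x for x, _ in picked}
        unlabeled = [x for x in unlabeled if x not in used]
    return model, best_f

labeled = [(-2.0, 0), (2.0, 1)]
unlabeled = [-3.0, -2.5, -1.0, -0.5, 0.5, 1.0, 2.5, 3.0]
validation = [(-1.5, 0), (-0.2, 0), (0.3, 1), (1.5, 1)]
model, score = competitive_self_training(labeled, unlabeled, validation)
```

Selecting among candidates on a held-out labeled set is what limits self-reinforcing mistakes in this sketch: a batch of wrong pseudo-labels that lowers the validation F-measure loses the competition instead of being absorbed into the model.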
