The effect of preprocessing techniques on Twitter sentiment analysis

As Twitter offers fertile ground for expressing thoughts and opinions, it is a valuable source for sentiment analysis. Furthermore, correctly identified reviews provide baseline information as input to other systems, such as e-learning systems and decision support systems. Data preprocessing is a crucial step in sentiment analysis, since selecting appropriate preprocessing methods can increase the number of correctly classified instances. In view of the above, this paper explains how to preprocess reviews in order to determine whether their sentiment is positive or negative. It presents an extended comparison of sentiment polarity classification methods for Twitter text and discusses in depth the role of text preprocessing in sentiment analysis. The experiments cover possible combinations of methods and report on their efficiency, using manually annotated Twitter datasets. Finally, the results show that feature selection and representation can affect classification performance positively.
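To make the kind of preprocessing discussed here concrete, the following is a minimal sketch in Python. The abstract does not list the specific steps evaluated in the paper, so the steps shown (lowercasing, removing URLs and user mentions, unwrapping hashtags, collapsing elongated words, and optional stop-word removal) are assumptions chosen because they are common for tweets. The function name `preprocess` and the small `STOPWORDS` set are hypothetical.

```python
import re

# Patterns for common tweet artifacts (illustrative choices, not the paper's).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
REPEAT_RE = re.compile(r"(.)\1{2,}")      # same character three or more times
TOKEN_RE = re.compile(r"[a-z0-9_']+")

# A deliberately tiny stop-word list, for illustration only.
STOPWORDS = frozenset({"a", "an", "the", "is", "to", "of", "and"})


def preprocess(tweet, remove_stopwords=True):
    """Return a list of normalized tokens for one tweet."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)           # drop links
    text = MENTION_RE.sub(" ", text)       # drop @user mentions
    text = HASHTAG_RE.sub(r"\1", text)     # keep the hashtag word, drop '#'
    text = REPEAT_RE.sub(r"\1\1", text)    # "loooove" becomes "loove"
    tokens = TOKEN_RE.findall(text)
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


if __name__ == "__main__":
    print(preprocess("@bob I loooove the new #iPhone!!! http://t.co/xyz"))
```

Each step can be switched on or off independently, which is how the effect of different combinations on classifier accuracy could be compared.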
