Sentiment Classification of Twitter Data Belonging to Saudi Arabian Telecommunication Companies

Twitter has attracted the attention of many researchers because every tweet is public by default, which is not the case with Facebook. In this paper, we present a sentiment analysis of tweets written in English, belonging to different telecommunication companies in Saudi Arabia. We apply several machine learning algorithms, including k-nearest neighbor (kNN), Artificial Neural Networks (ANN), and Naïve Bayesian classification. We classified the tweets into positive, negative and neutral classes based on Euclidean distance as well as cosine similarity. We also learned similarity matrices for kNN classification. CfsSubsetEvaluation and Information Gain were used for feature selection; the results with CfsSubsetEvaluation were better than those obtained with Information Gain. kNN performed better than the other algorithms, giving 75.4%, 76.6% and 75.6% for Precision, Recall and F-measure, respectively. We obtained an accuracy of 80.1% with a symmetric variant of kNN using cosine similarity. Interesting trends with respect to days, months, etc. were also discovered.
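
To make the two comparison measures concrete, the following is a minimal sketch of plain kNN over bag-of-words vectors, with neighbors ranked either by cosine similarity or by Euclidean distance. It is an illustration under stated assumptions, not the pipeline used in the paper: the toy tweets, the whitespace tokenizer, and the names `bag_of_words` and `knn_predict` are hypothetical, and neither the learned similarity matrices nor the symmetric kNN variant mentioned above is implemented here.

```python
import math
from collections import Counter

# Hypothetical toy training data; the paper's tweet corpus is not reproduced here.
TRAIN = [
    ("great network fast internet", "positive"),
    ("love the fast service", "positive"),
    ("terrible network slow internet", "negative"),
    ("hate the slow service", "negative"),
    ("new offer announced today", "neutral"),
    ("store opens today", "neutral"),
]

# Vocabulary built from the training tweets (simple whitespace tokenization).
VOCABULARY = sorted({word for text, _ in TRAIN for word in text.split()})


def bag_of_words(text, vocabulary=VOCABULARY):
    """Map a tweet to a vector of raw term counts over the vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (smaller means closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (larger means closer)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector shares no terms with anything
    return dot / (norm_a * norm_b)


def knn_predict(text, k=3, measure="cosine"):
    """Label a tweet by majority vote among its k nearest training tweets."""
    query = bag_of_words(text)
    if measure == "cosine":
        scored = [(cosine_similarity(query, bag_of_words(t)), label)
                  for t, label in TRAIN]
        scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    elif measure == "euclidean":
        scored = [(euclidean_distance(query, bag_of_words(t)), label)
                  for t, label in TRAIN]
        scored.sort(key=lambda pair: pair[0])  # smallest distance first
    else:
        raise ValueError("measure must be 'cosine' or 'euclidean'")
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


prediction_cosine = knn_predict("fast internet great", k=3, measure="cosine")
prediction_euclidean = knn_predict("fast internet great", k=3, measure="euclidean")
```

The two measures can rank neighbors differently on real data: cosine similarity ignores vector length, whereas Euclidean distance is affected by tweet length and term repetition.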
