Sentiment analysis of top colleges in India using Twitter data

Publicly accessible opinions and reviews are among the most important factors in shaping our views, and they influence the success of a brand, product or service. With the growth of social media, stakeholders often express their opinions on popular platforms such as Twitter. While Twitter data is highly informative, it is difficult to analyze because of its large volume and disorganized nature. This paper explores a novel domain: sentiment analysis of people's opinions regarding top colleges in India. In addition to preprocessing measures such as the expansion of net lingo and the removal of duplicate tweets, a probabilistic model based on Bayes' theorem was used for spelling correction, a step that is overlooked in other research studies. The paper also compares the results obtained with two machine learning algorithms, Naïve Bayes and Support Vector Machine (SVM), and with an artificial neural network model, the Multilayer Perceptron. Furthermore, four SVM kernels are contrasted: RBF, linear, polynomial and sigmoid.
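
The abstract does not spell out the spelling-correction model, so the following is only a minimal sketch of what a Bayes-based corrector might look like, in the spirit of Peter Norvig's well-known toy spelling corrector. For a misspelled word w it picks the candidate c that maximizes P(c) P(w | c). The toy corpus, the function names, and the simplified error model (any known word within one edit is preferred over the unchanged unknown word, and ties are broken by the prior alone) are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
import re

# Toy corpus standing in for a large body of correctly spelled text (assumption).
CORPUS = """
the college has a great campus and the faculty is good
the placement record of the college is great
students love the campus and the hostel food is good
"""

WORD_COUNTS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
TOTAL = sum(WORD_COUNTS.values())


def prior(word):
    """P(c): relative frequency of the candidate word in the corpus."""
    return WORD_COUNTS[word] / TOTAL


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the words that occur in the corpus."""
    return {w for w in words if w in WORD_COUNTS}


def correct(word):
    """Return the most probable correction under the simplified error model.

    Simplification (assumption): P(w | c) is treated as constant within each
    tier (the word itself if known, then known words one edit away, then the
    word unchanged), so the prior P(c) alone decides within a tier.
    """
    word = word.lower()
    candidates = known([word]) or known(edits1(word)) or {word}
    return max(candidates, key=prior)


if __name__ == "__main__":
    for token in ["campsu", "collge", "great", "xyzzy"]:
        print(token, "->", correct(token))
```

In practice the prior would be estimated from a much larger corpus, and tokens such as hashtags, user mentions and URLs would likely need to be excluded before correction so that they are not "corrected" into dictionary words.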
