Prediction of Indian election using sentiment analysis on Hindi Twitter

Sentiment analysis is a branch of machine learning and natural language processing. It is used to extract, recognize, or characterize opinions in different kinds of content, including news, reviews and articles, and to categorize them as positive, neutral or negative. Predicting election results from tweets written in Indian languages is difficult. We used the Twitter Archiver tool to collect tweets in Hindi. We performed text mining on 42,235 tweets, collected over a period of one month, that referenced five national political parties in India during the campaigning period for the general state elections in 2016. We used both supervised and unsupervised approaches: a dictionary-based method, Naive Bayes and SVM were used to build our classifiers, which labeled the test data as positive, negative or neutral. From these labels we identified the sentiment of Twitter users towards each of the political parties considered. Both Naive Bayes and SVM predicted the BJP (Bharatiya Janata Party) as the winner, while the dictionary-based approach predicted the Indian National Congress. SVM predicted a 78.4% chance that the BJP would win more seats in the election, owing to the positive sentiment it received in tweets. As it turned out, the BJP won 60 out of 126 constituencies in the 2016 election, far more than any other party; the next party, the Indian National Congress, won only 26 out of 126 constituencies.
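The dictionary-based step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries, the party labels and the helper names (`classify`, `sentiment_by_party`, `positive_share`) are hypothetical, and the scoring rule (positive word count minus negative word count) is an assumed, simplified one.

```python
import unicodedata
from collections import Counter, defaultdict


def norm(word: str) -> str:
    """Normalize Unicode and strip common punctuation so lookups are stable."""
    return unicodedata.normalize("NFC", word).strip("।,.!?\"'")


# Hypothetical toy lexicon; the abstract does not list the actual dictionary.
POSITIVE = {norm(w) for w in ["अच्छा", "जीत", "विकास"]}
NEGATIVE = {norm(w) for w in ["बुरा", "हार", "भ्रष्ट"]}


def classify(tweet: str) -> str:
    """Label a tweet by counting lexicon hits (assumed scoring rule)."""
    tokens = [norm(t) for t in tweet.split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_by_party(tweets):
    """Count labels per party from (party, text) pairs."""
    counts = defaultdict(Counter)
    for party, text in tweets:
        counts[party][classify(text)] += 1
    return counts


def positive_share(counts):
    """Fraction of each party's tweets that were labeled positive."""
    return {party: c["positive"] / sum(c.values()) for party, c in counts.items()}


if __name__ == "__main__":
    # Placeholder parties and made-up example tweets, for illustration only.
    sample = [
        ("PARTY_A", "विकास अच्छा है"),
        ("PARTY_A", "आज रैली है"),
        ("PARTY_B", "भ्रष्ट सरकार"),
    ]
    print(positive_share(sentiment_by_party(sample)))
```

Ranking parties by their share of positive tweets is one simple way to turn per-tweet labels into a prediction. The supervised classifiers mentioned above (Naive Bayes and SVM) would replace `classify` with a model trained on labeled tweets.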
