Twitter Sentiment Based Mining for Decision Making using Text Classifiers with Learning by Induction

The amount of untapped data residing in social media is vast, as millions of people constantly post messages to public forums on the internet. Twitter, one of the largest social media networks with over 336 million monthly active users, has proven to be fertile ground for harvesting opinions from many people. This work explores how opinions can be extracted from tweets to discover people's views on a given subject. It focuses on overcoming two limitations of the current approach to social media sentiment-based mining for decision making: opinions derived from multiple sources are limited to the available connections on the social media platform, and the accuracy of the mined opinions is limited. To address these, the proposed framework provides a platform to mine opinions from beyond a user's available friends and connections on the platform and, in addition, improves the quality of the mined opinions by applying supervised learning algorithms with learning by induction to Twitter data analysis. In this research, three supervised machine learning algorithms were applied to a dataset curated by graduate students at Stanford in order to classify tweets as expressing either positive or negative sentiment based on their content. Maximum Entropy achieved the highest accuracy of the three algorithms, at 83.5%. The research also provides a web application that enables users such as CEOs, market analysts, and casual users to make quality decisions based on others' opinions.
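As an illustration of the classification step, the following is a minimal sketch of a binary Maximum Entropy classifier (equivalent to logistic regression for two classes) over bag-of-words features. It is not the authors' implementation: the training sentences are hand-written toy examples rather than the Stanford dataset, and the names `TRAIN`, `tokenize`, `train_maxent`, `predict_proba`, and `classify`, as well as the learning rate and epoch count, are assumptions made for this example.

```python
import math
from collections import defaultdict

# Toy, hand-written examples (hypothetical; NOT the Stanford dataset).
# Label 1 = positive sentiment, 0 = negative sentiment.
TRAIN = [
    ("i love this phone", 1),
    ("what a great day", 1),
    ("love the great service", 1),
    ("i hate this phone", 0),
    ("what a terrible day", 0),
    ("hate the terrible service", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_maxent(examples, epochs=200, lr=0.5):
    """Fit word weights and a bias by stochastic gradient ascent
    on the log-likelihood of the labels."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            tokens = set(tokenize(text))  # binary word-presence features
            z = bias + sum(weights[t] for t in tokens)
            p = 1.0 / (1.0 + math.exp(-z))  # P(positive | tweet)
            error = label - p  # gradient of the log-likelihood w.r.t. z
            bias += lr * error
            for t in tokens:
                weights[t] += lr * error
    return weights, bias


def predict_proba(model, text):
    """Return the estimated probability that the text is positive."""
    weights, bias = model
    z = bias + sum(weights.get(t, 0.0) for t in set(tokenize(text)))
    return 1.0 / (1.0 + math.exp(-z))


def classify(model, text):
    """Map the probability to one of the two sentiment labels."""
    return "positive" if predict_proba(model, text) >= 0.5 else "negative"


model = train_maxent(TRAIN)
print(classify(model, "i love the service"))
print(classify(model, "i hate the day"))
```

On real tweets, a classifier of this kind would normally be trained on a far larger labeled corpus and evaluated on held-out data; the toy data here only shows the mechanics of learning word weights from labeled examples.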
