Predictive Sentiment Analysis of Tweets: A Stock Market Application

The application addressed in this paper studies whether Twitter feeds, expressing public opinion concerning companies and their products, are a suitable data source for forecasting the movements in stock closing prices. We use the term predictive sentiment analysis to denote the approach in which sentiment analysis is used to predict the changes in the phenomenon of interest. In this paper, positive sentiment probability is proposed as a new indicator to be used in predictive sentiment analysis in finance. By using the Granger causality test we show that sentiment polarity (positive and negative sentiment) can indicate stock price movements a few days in advance. Finally, we adapted the Support Vector Machine classification mechanism to categorize tweets into three sentiment categories (positive, negative and neutral), resulting in improved predictive power of the classifier in the stock market application.

[1]  Gilad Mishne,et al.  Predicting Movie Sales from Blogger Sentiment , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[2]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[3]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[4]  Thorsten Joachims,et al.  Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.

[5]  E. Fama Random Walks in Stock Market Prices , 1965 .

[6]  C. Granger Investigating causal relations by econometric models and cross-spectral methods , 1969 .

[7]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[8]  Johan Bollen,et al.  Twitter mood predicts the stock market , 2010, J. Comput. Sci..

[9]  John R. Nofsinger Social Mood and Financial Economics , 2005 .

[10]  Ramanathan V. Guha,et al.  The predictive power of online chatter , 2005, KDD '05.

[11]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[12]  Eric Gilbert,et al.  Widespread Worry and the Stock Market , 2010, ICWSM.

[13]  Martin Žnidaršič,et al.  Sentiment analysis on tweets in a financial domain , 2012 .

[14]  Owen Rambow,et al.  Sentiment Analysis of Twitter Data , 2011 .

[15]  Jonathon Read,et al.  Using Emoticons to Reduce Dependency in Machine Learning Techniques for Sentiment Classification , 2005, ACL.

[16]  Ronen Feldman,et al.  Book Reviews: The Text Mining Handbook: Advanced Approaches to Analyzing Unstructured Data by Ronen Feldman and James Sanger , 2008, CL.

[17]  C. Frith,et al.  DESCARTES ERROR - EMOTION, REASON AND THE HUMAN BRAIN - DAMASIO,AR , 1995 .

[18]  Manolis G. Kavussanos,et al.  A multivariate test for stock market efficiency: the case of ASE , 2001 .

[19]  M. Schervish P Values: What They are and What They are Not , 1996 .

[20]  Aristides Gionis,et al.  Correlating financial time series with micro-blogging activity , 2012, WSDM '12.

[21]  Mike Thelwall,et al.  Sentiment in Twitter events , 2011, J. Assoc. Inf. Sci. Technol..

[22]  Bernardo A. Huberman,et al.  Predicting the Future with Social Media , 2010, 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology.

[23]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[24]  Sean A. Spence,et al.  Descartes' Error: Emotion, Reason and the Human Brain , 1995 .

[25]  Peter D. Turney Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews , 2002, ACL.

[26]  Guido Caldarelli,et al.  Web Search Queries Can Predict Stock Market Volumes , 2011, PloS one.