Correlating Twitter with the stock market through non-Gaussian SVAR

In this paper, we study the correlation between Twitter and the stock market. We first apply non-Gaussian SVAR (structural vector autoregression) to identify possible relationships among the Twitter and stock market factors. Conventional models such as the Granger causality method assume that the error terms are Gaussian and consider only time-lagged effects. Non-Gaussian SVAR instead assumes that the error terms are non-Gaussian, which better fits stock market data, and takes both instantaneous and time-lagged effects into account. We also visualize some distinctive relationships in parallel coordinates, a well-developed multivariate visualization technique that, to the best of our knowledge, is seldom used in financial studies. Then, to examine whether the Twitter-stock market relationships returned by non-Gaussian SVAR can help predict stock market indicators, we build a series of regression models to predict DJI (Dow Jones Industrial Average Index) return in a sliding time window. Our experiments demonstrate that all the Twitter factors correlate with DJI return, and that only the negative sentiment in tweets (posts on Twitter) is associated with DJI return volatility. Moreover, over the period covered by our data set, the lagged Twitter factors are more effective than the lagged stock market indicators at predicting DJI return.
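To make the sliding-window prediction step concrete, the following is a minimal sketch and not the models used in the paper. It assumes a single lagged predictor, ordinary least squares, and synthetic data; the names `fit_ols`, `sliding_window_forecast`, `factor`, and `window` are hypothetical and chosen only for illustration.

```python
import random


def fit_ols(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sliding_window_forecast(factor, returns, window):
    """One-step-ahead forecasts of returns[t] from the lagged value factor[t-1].

    For each day t, the regression is refit on the `window` most recent
    (factor[s-1], returns[s]) pairs with s < t, then applied to factor[t-1].
    """
    forecasts = {}
    for t in range(window + 1, len(returns)):
        xs = factor[t - window - 1 : t - 1]  # lagged predictor values
        ys = returns[t - window : t]         # returns aligned one day later
        intercept, slope = fit_ols(xs, ys)
        forecasts[t] = intercept + slope * factor[t - 1]
    return forecasts


if __name__ == "__main__":
    # Synthetic data only (an assumption for illustration, not the paper's data):
    # today's return depends weakly on yesterday's hypothetical Twitter factor.
    random.seed(0)
    factor = [random.gauss(0.0, 1.0) for _ in range(60)]
    returns = [0.0] + [0.3 * f + random.gauss(0.0, 0.5) for f in factor[:-1]]
    forecasts = sliding_window_forecast(factor, returns, window=20)
    errors = [abs(forecasts[t] - returns[t]) for t in forecasts]
    print("mean absolute error:", sum(errors) / len(errors))
```

Refitting on a fixed-length window lets the regression coefficients drift over time, which is one common reason for evaluating forecasts in a sliding window rather than with a single fit over the whole sample.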
