Correlating financial time series with micro-blogging activity

We study the problem of correlating micro-blogging activity with stock-market events, defined as changes in the price and traded volume of stocks. Specifically, we collect messages related to a number of companies and search for correlations between stock-market events for those companies and features extracted from the messages. The features fall into two groups. Features in the first group measure overall activity on the micro-blogging platform, such as the number of posts and the number of re-posts. Features in the second group measure properties of an induced interaction graph, for instance the number of connected components and statistics on the degree distribution, among other graph-based properties. Using Twitter as a data source, we present detailed experimental results measuring the correlation of stock-market events with these features. Our results show that the most correlated features are the number of connected components and the number of nodes of the interaction graph. The correlation is stronger with traded volume than with stock price. However, using a simulator, we show that even relatively small correlations between price and micro-blogging features can be exploited to drive a stock-trading strategy that outperforms baseline strategies.
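To make the two graph features named above concrete, the following is a minimal sketch of how one might count the nodes and connected components of a daily interaction graph and correlate the component count with traded volume. It is not the authors' pipeline: the abstract does not specify how the interaction graph is built, so the edge lists, the function names `graph_features` and `pearson`, and the toy volume figures are all hypothetical, and the Pearson coefficient is only one possible choice of correlation measure.

```python
from math import sqrt


def graph_features(edges):
    """Return (number of nodes, number of connected components)
    for an undirected graph given as a list of (u, v) pairs.
    Uses a simple union-find structure."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    roots = {find(x) for x in list(parent)}
    return len(parent), len(roots)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical toy data: one edge list per trading day for a single
# company, plus made-up traded volumes for the same days.
daily_edges = [
    [("a", "b"), ("c", "d")],
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("c", "d"), ("e", "f")],
]
daily_volume = [200, 100, 300]

components_per_day = [graph_features(edges)[1] for edges in daily_edges]
correlation = pearson(components_per_day, daily_volume)
```

In this sketch each day yields one value per feature, so a feature becomes a time series that can be compared against the price or volume series for the same company. A real study would also need to handle lagged correlations and days with no messages, which this example leaves out.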
