Social media data assisted inference with application to stock prediction

Access to massive amounts of social media data gives the signal processing community a unique opportunity to extract information that can be used to make inferences about unfolding events. It is desirable to investigate how sensor networks and social media can converge to facilitate the data-to-decision process, and to study how the two systems can complement each other for enhanced situational awareness. In this paper, we propose a copula-based joint characterization of multiple dependent time series from sensors and social media. As a proof of concept, this model is applied to the fusion of Google Trends (GT) data and stock price data of Apple Inc. for prediction, where the stock data serves as a surrogate for sensor data. Superior prediction performance is demonstrated by taking the non-linear dependence between social media data and sensor data into consideration.
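
For readers unfamiliar with copulas, the standard decomposition behind a copula-based joint characterization (Sklar's theorem, for continuous marginals) can be sketched as follows. The notation here is an assumption for illustration and is not taken from the paper: $x_1, \dots, x_n$ denote the dependent variables (for example, a GT index and a stock price), $F_i$ and $f_i$ their marginal distribution and density functions, and $c$ the copula density.

```latex
% Sklar's theorem in density form (assumed notation, continuous marginals):
% the joint density factors into a copula density, which carries all of the
% dependence, and the product of the marginal densities.
\begin{equation}
  f(x_1, \dots, x_n)
    = c\bigl(F_1(x_1), \dots, F_n(x_n)\bigr) \prod_{i=1}^{n} f_i(x_i)
\end{equation}
```

Because $c$ is specified separately from the marginals, heterogeneous sources can each keep their own marginal model while a single copula captures possibly non-linear dependence among them.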
