Improving stock market prediction via heterogeneous information fusion

Traditional stock market prediction approaches commonly use historical price data to forecast future trends. As Web information grows, recent work has explored financial news to improve prediction. Indicators such as events related to the stocks and people's sentiments towards the market and individual stocks have been shown to play important roles in stock volatility, and are extracted and fed into prediction models to improve accuracy. However, a major limitation of previous methods is that the indicators are obtained either from a single source whose reliability might be low, or from several sources whose interactions and correlations are largely ignored. In this work, we extract events from Web news and users' sentiments from social media, and investigate their joint impact on stock price movements via a coupled matrix and tensor factorization framework. Specifically, a tensor is first constructed to fuse the heterogeneous data and capture the intrinsic relations among the events and the investors' sentiments. Because the tensor is sparse, two auxiliary matrices, the stock quantitative feature matrix and the stock correlation matrix, are constructed and incorporated to assist the tensor decomposition. The intuition is that stocks that are highly correlated with each other tend to be affected by the same event. Thus, instead of treating each stock prediction task separately and independently, we predict multiple correlated stocks simultaneously through their commonalities, which is enabled by sharing the collaboratively factorized low-rank matrices between the auxiliary matrices and the tensor. Evaluations on China A-share stock data and HK stock data from the year 2015 demonstrate the effectiveness of the proposed model.
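
To make the coupling idea concrete, the following is a minimal sketch of a coupled matrix and tensor factorization in which a sparse tensor and two auxiliary matrices share one stock factor matrix. It is an illustration under stated assumptions, not the model evaluated in this work: the abstract does not specify the decomposition form or the optimizer, so the sketch substitutes a CP-style factorization fitted by plain gradient descent. The tensor axes (stocks x event features x sentiment features), all function and parameter names, and the synthetic data are hypothetical.

```python
import numpy as np


def cp_reconstruct(U, V, W):
    """Rebuild a 3-way tensor from CP factors: X[i,j,k] = sum_r U[i,r] V[j,r] W[k,r]."""
    return np.einsum("ir,jr,kr->ijk", U, V, W)


def fit_cmtf(X, mask, Y, Z, rank=3, lam_y=1.0, lam_z=1.0,
             reg=1e-3, lr=0.02, steps=2000, seed=0):
    """Fit X ~ [[U, V, W]], Y ~ U F^T, Z ~ U G^T with a shared stock factor U.

    X    : (stocks, event features, sentiment features) tensor (assumed layout)
    mask : same shape as X, 1.0 where an entry is observed, 0.0 otherwise
    Y    : (stocks, quantitative features) auxiliary matrix
    Z    : (stocks, stocks) auxiliary correlation matrix
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    U = 0.1 * rng.standard_normal((I, rank))
    V = 0.1 * rng.standard_normal((J, rank))
    W = 0.1 * rng.standard_normal((K, rank))
    F = 0.1 * rng.standard_normal((Y.shape[1], rank))
    G = 0.1 * rng.standard_normal((Z.shape[1], rank))
    losses = []
    for _ in range(steps):
        # Residuals: the tensor term only counts observed entries.
        E = mask * (cp_reconstruct(U, V, W) - X)
        Ey = U @ F.T - Y
        Ez = U @ G.T - Z
        penalty = sum(np.sum(A ** 2) for A in (U, V, W, F, G))
        losses.append(0.5 * np.sum(E ** 2) + 0.5 * lam_y * np.sum(Ey ** 2)
                      + 0.5 * lam_z * np.sum(Ez ** 2) + 0.5 * reg * penalty)
        # U receives gradient from all three terms; this is the coupling.
        gU = (np.einsum("ijk,jr,kr->ir", E, V, W)
              + lam_y * Ey @ F + lam_z * Ez @ G + reg * U)
        gV = np.einsum("ijk,ir,kr->jr", E, U, W) + reg * V
        gW = np.einsum("ijk,ir,jr->kr", E, U, V) + reg * W
        gF = lam_y * Ey.T @ U + reg * F
        gG = lam_z * Ez.T @ U + reg * G
        U -= lr * gU
        V -= lr * gV
        W -= lr * gW
        F -= lr * gF
        G -= lr * gG
    return (U, V, W, F, G), losses


def make_synthetic(seed=0, I=6, J=5, K=4, Dy=4, rank=3, observed=0.3):
    """Synthetic low-rank data (purely illustrative, not real market data)."""
    rng = np.random.default_rng(seed)
    U = rng.uniform(size=(I, rank))
    V = rng.uniform(size=(J, rank))
    W = rng.uniform(size=(K, rank))
    X = cp_reconstruct(U, V, W)
    mask = (rng.uniform(size=X.shape) < observed).astype(float)
    Y = U @ rng.uniform(size=(Dy, rank)).T
    Z = U @ rng.uniform(size=(I, rank)).T
    return X, mask, Y, Z


X, mask, Y, Z = make_synthetic()
(U, V, W, F, G), losses = fit_cmtf(X, mask, Y, Z)
X_filled = cp_reconstruct(U, V, W)  # estimates for unobserved entries too
print("initial loss:", losses[0], "final loss:", losses[-1])
```

In this sketch the sparse tensor alone contributes few observed entries per stock, while the two auxiliary matrices constrain the same stock factor `U`, which is how information about correlated stocks can be shared during decomposition. The weights `lam_y` and `lam_z` are hypothetical knobs for how strongly each auxiliary matrix influences `U`.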
