On the impact of publicly available news and information transfer to financial markets

We quantify the propagation and absorption of large-scale publicly available news articles from the World Wide Web to financial markets. To extract publicly available information, we use the news archives from the Common Crawl, a non-profit organization that crawls a large part of the web. We develop a processing pipeline to identify news articles associated with the constituent companies in the S&P 500 index, an equity market index that measures the stock performance of US companies. Using machine learning techniques, we extract sentiment scores from the Common Crawl News data and employ tools from information theory to quantify the information transfer from public news articles to the US stock market. Furthermore, we analyse and quantify the economic significance of the news-based information with a simple sentiment-based portfolio trading strategy. Our findings support the view that information in publicly available news on the World Wide Web has a statistically and economically significant impact on events in financial markets.
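As an illustration of how information transfer between a news-sentiment series and a return series might be quantified, the following is a minimal sketch of lag-1 transfer entropy on discretized sequences. The abstract only refers to "tools from information theory", so the choice of transfer entropy, the lag of one step, the plug-in (frequency count) estimator, and the function name `transfer_entropy` are all assumptions made for this example, not a description of the pipeline used in the paper. The data is synthetic.

```python
import random
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of lag-1 transfer entropy (in bits) from source to target.

    Both inputs are equal-length sequences of discrete symbols, for example
    sentiment and return series binned into 0/1. Hypothetical helper.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    # Each sample is (target at t+1, target at t, source at t).
    samples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(samples)
    if n == 0:
        return 0.0
    joint = Counter(samples)
    past_pair = Counter((y0, x0) for _, y0, x0 in samples)
    target_pair = Counter((y1, y0) for y1, y0, _ in samples)
    target_past = Counter(y0 for _, y0, _ in samples)
    te = 0.0
    for (y1, y0, x0), count in joint.items():
        p_joint = count / n
        # p(y1 | y0, x0): prediction using both histories.
        p_full = count / past_pair[(y0, x0)]
        # p(y1 | y0): prediction using the target's own history only.
        p_own = target_pair[(y1, y0)] / target_past[y0]
        te += p_joint * log2(p_full / p_own)
    return te


# Synthetic example: the "return" symbol copies the "sentiment" symbol
# from the previous step, so information flows from sentiment to returns.
random.seed(0)
sentiment = [random.randint(0, 1) for _ in range(2000)]
returns = [0] + sentiment[:-1]

te_forward = transfer_entropy(sentiment, returns)
te_backward = transfer_entropy(returns, sentiment)
print(f"sentiment -> returns: {te_forward:.3f} bits")
print(f"returns -> sentiment: {te_backward:.3f} bits")
```

In this synthetic setup the forward estimate should be close to one bit and the backward estimate close to zero. On real data, the plug-in estimator is biased for short samples, so a significance test against shuffled surrogates would normally accompany the estimate.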
