Quantifying News Narratives to Predict Movements in Market Risk

The theory of Narrative Economics suggests that narratives present in the media influence market participants and drive economic events. In this chapter, we investigate how financial news narratives relate to movements in the CBOE Volatility Index (VIX). To this end, we first introduce a previously unexplored dataset in which each news article is described by a set of financial keywords. We then perform topic modeling to extract news themes, comparing the canonical latent Dirichlet allocation (LDA) with a technique that combines doc2vec and Gaussian mixture models. Finally, using the XGBoost (Extreme Gradient Boosting) machine learning algorithm, we show that the resulting news features outperform a simple baseline when predicting CBOE Volatility Index movements over different time horizons.
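To make the prediction target concrete, one plausible framing is a binary up/down label for the index over a horizon of h trading days. The sketch below is an illustration only: the abstract does not specify how movements are defined, so the helper name `movement_labels`, the strict "closes higher" rule, and the sample values are all assumptions.

```python
from typing import List, Optional


def movement_labels(closes: List[float], horizon: int) -> List[Optional[int]]:
    """Label each day 1 if the index closes higher `horizon` days later, else 0.

    Hypothetical helper: the labelling rule is an assumption, not the
    chapter's confirmed definition. Days without a future value get None.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    labels: List[Optional[int]] = []
    for t in range(len(closes)):
        if t + horizon < len(closes):
            # Compare the close `horizon` days ahead with today's close.
            labels.append(1 if closes[t + horizon] > closes[t] else 0)
        else:
            # Not enough future data to define a movement.
            labels.append(None)
    return labels


# Made-up closing values, for illustration only.
vix_closes = [15.0, 16.2, 14.8, 14.8, 18.1, 17.5]
labels_1d = movement_labels(vix_closes, horizon=1)
labels_3d = movement_labels(vix_closes, horizon=3)
```

Under this framing, news features computed for day t would be paired with the label for day t, and varying `horizon` yields the different prediction tasks.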
