Deep Learning for Assessing Banks’ Distress from News and Numerical Financial Data

In this paper we focus on exploiting the information contained in financial news to improve the performance of a classifier of bank distress. Incorporating such information into a predictive model efficiently raises the issues of text analysis in general and of news media analysis in particular. Among the models proposed for this purpose, we investigate a deep learning approach based on a doc2vec representation of the textual data, that is, a neural network that maps the sequence of words in a text onto a reduced latent semantic space. A second, supervised neural network is then trained on the news data combined with standard financial figures to classify banks as being in either a distressed or a tranquil state. The aim is not only to improve the predictive performance of the classifier but also to assess the importance of news data in the classification process: does news data carry useful information that is not contained in standard financial variables? Our results seem to confirm this hypothesis.
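
The second stage described above, in which document vectors are combined with financial figures and passed to a supervised classifier, can be illustrated with a minimal sketch. It is not the model used in the paper: the document vectors are assumed to be precomputed (for example by a doc2vec model), the data are synthetic toy values, and a single logistic unit stands in for the supervised neural network. All names (`combine`, `train_logistic`, `predict_proba`, `make_example`) are hypothetical.

```python
import math
import random


def combine(doc_vec, fin_vec):
    """Concatenate a news document vector with numerical financial features."""
    return list(doc_vec) + list(fin_vec)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """Estimated probability that the input describes a distressed bank."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit a single logistic unit with stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of the log loss
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


# Synthetic toy data (assumption): 4-dimensional "document vectors" and
# 2 "financial ratios", shifted in opposite directions for the two classes.
random.seed(0)


def make_example(distressed):
    shift = 1.0 if distressed else -1.0
    doc_vec = [random.gauss(shift, 0.5) for _ in range(4)]
    fin_vec = [random.gauss(-shift, 0.5) for _ in range(2)]
    return combine(doc_vec, fin_vec), 1 if distressed else 0


data = [make_example(i % 2 == 0) for i in range(40)]
X = [features for features, _ in data]
y = [label for _, label in data]

w, b = train_logistic(X, y)
accuracy = sum(
    (predict_proba(w, b, x) > 0.5) == bool(t) for x, t in zip(X, y)
) / len(y)
```

To probe whether the news features add information beyond the financial variables, one could train the same classifier on the financial columns alone and compare held-out performance between the two models.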
