Predicting Abnormal Bank Stock Returns Using Textual Analysis of Annual Reports - a Neural Network Approach

This paper extracts both sentiment and bag-of-words features from the annual reports of U.S. banks. Sentiment is measured with two commonly used finance-specific dictionaries, while the bag-of-words terms are selected according to their tf-idf weights. We combine these features with financial indicators to predict abnormal bank stock returns using a neural network with dropout regularization and rectified linear units. We show that this method outperforms other machine learning algorithms (Naive Bayes, support vector machine, C4.5 decision tree, and k-nearest neighbour classifier) in predicting positive/negative abnormal stock returns. This suggests that the neural network is well suited to text classification tasks involving sparse, high-dimensional data. We also show that prediction quality increased significantly when financial indicators were combined with bigrams or with trigrams.
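To make the tf-idf selection step concrete, here is a minimal sketch of ranking n-grams by tf-idf weight. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the tf-idf variant, so the sketch assumes length-normalized term frequency, idf = log(N / df), and ranking each n-gram by its maximum weight over the corpus. The function names `ngrams` and `top_tfidf_terms` and the toy sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_tfidf_terms(docs, n=2, k=3):
    """Return the k n-grams with the highest tf-idf weight in any document.

    Assumed variant (not taken from the paper):
    tf = count / document length, idf = log(N / df).
    Ties are broken alphabetically so the result is deterministic.
    """
    tokenized = [ngrams(doc.lower().split(), n) for doc in docs]
    # Document frequency: in how many documents each n-gram appears.
    df = Counter(term for doc in tokenized for term in set(doc))
    n_docs = len(docs)
    best = {}
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc)
        for term, count in counts.items():
            weight = (count / total) * math.log(n_docs / df[term])
            best[term] = max(best.get(term, 0.0), weight)
    return sorted(best, key=lambda t: (-best[t], t))[:k]


# Toy corpus standing in for annual-report text.
docs = [
    "credit risk increased",
    "credit risk declined",
    "net income increased",
]
selected_bigrams = top_tfidf_terms(docs, n=2, k=3)
```

In this toy corpus the bigram "credit risk" appears in two of the three documents, so its idf is lower and it ranks below bigrams that occur in only one document. The selected terms would then serve as sparse input features alongside the sentiment scores and financial indicators.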
