Bank Failure Prediction: A Comparison of Machine Learning Approaches

This paper presents a comprehensive study of bank failures, examined from several perspectives. It draws on a dataset of ~60,000 observations covering an extended period (2005–2014) and examines different prediction horizons prior to failure. Moreover, we explore whether adding variables related to the diversification of banks' activities, along with local effects, improves the predictive ability of the models. Seven widely used machine learning techniques are compared under different performance metrics using a bootstrap analysis. The results show that mid- to long-term prediction improves significantly with the addition of diversification variables. Local effects exist and further improve the results, while support vector machines, gradient boosting, and random forests outperform traditional models, with the performance differences increasing over longer prediction horizons.
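The abstract does not spell out the bootstrap procedure, so the following is only a minimal sketch of how two classifiers might be compared under one metric with a bootstrap. The choice of AUC as the metric, the percentile interval, the function names, and the synthetic scores are all assumptions made for illustration, not the authors' method or data.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_boot=200, seed=0):
    """Resample observations with replacement and recompute AUC each time."""
    rng = random.Random(seed)
    n = len(labels)
    results = []
    while len(results) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class (AUC is undefined).
        if 0 < sum(ys) < n:
            results.append(auc(ys, [scores[i] for i in idx]))
    return results


def percentile_interval(values, alpha=0.05):
    """Simple percentile interval over the bootstrap replicates."""
    v = sorted(values)
    lo = v[int((alpha / 2) * (len(v) - 1))]
    hi = v[int((1 - alpha / 2) * (len(v) - 1))]
    return lo, hi


# Synthetic illustration only: roughly 10% "failed" banks and two
# hypothetical models whose scores differ in how well they separate classes.
rng = random.Random(42)
labels = [1 if rng.random() < 0.10 else 0 for _ in range(300)]
if sum(labels) == 0:  # guarantee both classes are present
    labels[0] = 1
model_a = [0.6 * y + rng.gauss(0, 0.3) for y in labels]
model_b = [0.3 * y + rng.gauss(0, 0.3) for y in labels]

for name, scores in (("model_a", model_a), ("model_b", model_b)):
    lo, hi = percentile_interval(bootstrap_auc(labels, scores))
    print(f"{name}: AUC={auc(labels, scores):.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

Using the same seed for both models means both are evaluated on identical resamples, which keeps the comparison paired; a real study would also repeat this for each prediction horizon and metric.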
