Machine learning models and bankruptcy prediction

Machine learning models show improved bankruptcy prediction accuracy over traditional models. Various models were tested using different accuracy metrics. Boosting, bagging, and random forest models provide better results.

There has been intensive research by academics and practitioners on models for predicting bankruptcy and default events for credit risk management. Seminal academic research has evaluated bankruptcy using traditional statistical techniques (e.g. discriminant analysis and logistic regression) and early artificial intelligence models (e.g. artificial neural networks). In this study, we test machine learning models (support vector machines, bagging, boosting, and random forest) to predict bankruptcy one year prior to the event, and compare their performance with results from discriminant analysis, logistic regression, and neural networks. We use data from 1985 to 2013 on North American firms, integrating information from the Salomon Center database and Compustat, and analyse more than 10,000 firm-year observations. The key insight of the study is a substantial improvement in prediction accuracy using machine learning techniques, especially when, in addition to the original Altman's Z-score variables, we include six complementary financial indicators. Based on Carton and Hofer (2006), we use new variables, such as the operating margin, change in return-on-equity, change in price-to-book, and growth measures related to assets, sales, and number of employees, as predictive variables. Machine learning models show, on average, approximately 10% higher accuracy than traditional models. Comparing the best models, with all predictive variables, random forest reached 87% accuracy, whereas logistic regression and linear discriminant analysis reached 69% and 50% accuracy, respectively, in the testing sample.
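To make the predictor set concrete, the following is a minimal sketch of how the five Altman (1968) Z-score ratios and the six complementary indicators might be computed for one firm-year. The `FirmYear` fields, the `build_predictors` function, and the exact ratio definitions (for example, operating margin as operating income over sales, and growth as a simple one-year rate) are assumptions made for illustration; the study's own variable definitions may differ.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FirmYear:
    """Raw figures for one firm in one fiscal year.

    Field names are hypothetical; they are not Compustat mnemonics.
    """
    working_capital: float
    retained_earnings: float
    ebit: float
    market_equity: float
    total_liabilities: float
    sales: float
    total_assets: float
    operating_income: float
    net_income: float
    book_equity: float
    employees: float


def growth(current: float, previous: float) -> float:
    """Simple one-year growth rate (assumed definition)."""
    return (current - previous) / previous


def build_predictors(cur: FirmYear, prev: FirmYear) -> Dict[str, float]:
    """Return the eleven predictors for `cur`, using `prev` for changes."""
    roe_cur = cur.net_income / cur.book_equity
    roe_prev = prev.net_income / prev.book_equity
    pb_cur = cur.market_equity / cur.book_equity
    pb_prev = prev.market_equity / prev.book_equity
    return {
        # Altman (1968) Z-score ratios
        "wc_ta": cur.working_capital / cur.total_assets,
        "re_ta": cur.retained_earnings / cur.total_assets,
        "ebit_ta": cur.ebit / cur.total_assets,
        "mve_tl": cur.market_equity / cur.total_liabilities,
        "sales_ta": cur.sales / cur.total_assets,
        # Six complementary indicators (definitions assumed)
        "operating_margin": cur.operating_income / cur.sales,
        "delta_roe": roe_cur - roe_prev,
        "delta_pb": pb_cur - pb_prev,
        "asset_growth": growth(cur.total_assets, prev.total_assets),
        "sales_growth": growth(cur.sales, prev.sales),
        "employee_growth": growth(cur.employees, prev.employees),
    }
```

Each firm-year then becomes one row of eleven features, paired with a label indicating whether the firm went bankrupt in the following year. A real pipeline would also need to handle zero or negative denominators, which this sketch does not.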
We find that bagging, boosting, and random forest models outperform the other techniques, and that prediction accuracy in the testing sample improves for all models when the additional variables are included. Our research adds to the continuing debate about the superiority of computational methods over statistical techniques, as in Tsai, Hsu, and Yen (2014) and Yeh, Chi, and Lin (2014). In particular, among machine learning mechanisms, we do not find that SVM leads to higher accuracy rates than other models. This result contradicts outcomes from Danenas and Garsva (2015) and Cleofas-Sanchez, Garcia, Marques, and Sanchez (2016), but corroborates, for instance, Wang, Ma, and Yang (2014), Liang, Lu, Tsai, and Shih (2016), and Cano et al. (2017). Our study supports the applicability of expert systems by practitioners, as in Heo and Yang (2014), Kim, Kang, and Kim (2015), and Xiao, Xiao, and Wang (2016).
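The comparison described above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the study's implementation: the data are synthetic (generated with `make_classification` as a stand-in for the firm-year sample), AdaBoost stands in for "boosting", and the `compare_models` function, the hyperparameters, and the 70/30 split are all choices made for this example. The accuracies it produces say nothing about the results reported in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(n_samples=600, n_features=11, seed=0):
    """Fit seven classifiers and return their test-set accuracy."""
    # Synthetic stand-in for eleven financial predictors per firm-year.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=6,
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        # Traditional techniques and the early AI model
        "lda": LinearDiscriminantAnalysis(),
        "logistic": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "neural_net": make_pipeline(
            StandardScaler(),
            MLPClassifier(
                hidden_layer_sizes=(16,), max_iter=2000, random_state=seed
            ),
        ),
        # Machine learning models
        "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "bagging": BaggingClassifier(n_estimators=50, random_state=seed),
        "boosting": AdaBoostClassifier(n_estimators=50, random_state=seed),
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
    }
    return {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, accuracy in compare_models().items():
        print(f"{name:>13}: {accuracy:.3f}")
```

Scaling is applied only to the models that are sensitive to feature magnitude; the tree-based ensembles are left unscaled. With real bankruptcy data the classes are typically imbalanced, so accuracy alone can be misleading and additional metrics would be worth reporting.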

[1] Jon Atli Benediktsson, et al. Automatic selection of molecular descriptors using random forest: Application to drug discovery, 2017, Expert Syst. Appl.

[2] Rudrajeet Pal, et al. Business health characterization: A hybrid regression and support vector machine analysis, 2016, Expert Syst. Appl.

[3] Jian Ma, et al. Two credit scoring models based on dual strategy ensemble trees, 2012, Knowl. Based Syst.

[4] Corinna Cortes, et al. Support-Vector Networks, 1995, Machine Learning.

[5] José Salvador Sánchez, et al. Financial distress prediction using the hybrid associative memory with translation, 2016, Appl. Soft Comput.

[6] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, EuroCOLT.

[7] Michael D. Vose, et al. No Free Lunch and Benchmarks, 2013, Evolutionary Computation.

[8] Jing Chen, et al. Financial Distress and Idiosyncratic Volatility: An Empirical Investigation, 2010.

[9] Young-Chan Lee, et al. Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters, 2005, Expert Syst. Appl.

[10] Donald P. Cram, et al. Assessing the Probability of Bankruptcy, 2002.

[11] Jens Leker, et al. Credit risk prediction using support vector machines, 2011.

[12] Yu. L. Muromtsev, et al. Assessing the probability of pit fires, 1969.

[13] Ponnuthurai N. Suganthan, et al. Modeling of steelmaking process with effective machine learning techniques, 2015, Expert Syst. Appl.

[14] Hyejin Park, et al. Parametric models and non-parametric machine learning models for predicting option prices: Empirical comparison study over KOSPI 200 Index options, 2014, Expert Syst. Appl.

[15] Susan G. Watts, et al. Bankruptcy classification errors in the 1980s: An empirical analysis of Altman's and Ohlson's models, 1996.

[16] Luca Calderoni, et al. Indoor localization in a hospital environment using Random Forest classifiers, 2015, Expert Syst. Appl.

[17] Michael Lemmon, et al. Book-to-Market Equity, Distress Risk, and Stock Returns, 2002.

[18] Ching-Chiang Yeh, et al. Going-concern prediction using hybrid random forests and rough set approach, 2014, Inf. Sci.

[19] Huosheng Hu, et al. Support Vector Machine-Based Classification Scheme for Myoelectric Control Applied to Upper Limb, 2008, IEEE Transactions on Biomedical Engineering.

[20] Silvia Figini, et al. Corporate Default Prediction Model Averaging: A Normative Linear Pooling Approach, 2016, Intell. Syst. Account. Finance Manag.

[21] Jinyong Yang, et al. AdaBoost based bankruptcy forecasting of Korean construction companies, 2014, Appl. Soft Comput.

[22] Pedro Lorca. Financial Distress Prediction: The Way Forward, 2012.

[23] Hui Li, et al. Gaussian case-based reasoning for business failure prediction with empirical data in China, 2009, Inf. Sci.

[24] Arun Upneja, et al. An examination of capital structure in the restaurant industry, 2000.

[25] Asghar Ali, et al. Macroeconomic determinants of credit risk: Recent evidence from a cross country study, 2010.

[26] H. Karimi, et al. Output Feedback Control of Discrete Impulsive Switched Systems with State Delays and Missing Measurements, 2013.

[27] Wuyi Yue, et al. Support vector machine based multiagent ensemble learning for credit risk evaluation, 2010, Expert Syst. Appl.

[28] Mingliang Wang, et al. Prediction of Banking Systemic Risk Based on Support Vector Machine, 2013.

[29] Federico Girosi, et al. Training support vector machines: an application to face detection, 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[30] Paulius Danenas, et al. Selection of Support Vector Machines based classifiers for credit risk domain, 2015, Expert Syst. Appl.

[31] Chunchi Wu, et al. Default prediction with dynamic sectoral and macroeconomic frailties, 2014.

[32] Ekrem Duman, et al. Detecting credit card fraud by Modified Fisher Discriminant Analysis, 2015, Expert Syst. Appl.

[33] Edward I. Altman, et al. Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy, 1968.

[34] Diana Bonfim. Credit risk drivers: Evaluating the contribution of firm level information and of macroeconomic dynamics, 2009.

[35] James A. Ohlson. Financial Ratios and the Probabilistic Prediction of Bankruptcy, 1980.

[36] Jian Ma, et al. An improved boosting based on feature selection for corporate bankruptcy prediction, 2014, Expert Syst. Appl.

[37] Yong Shi, et al. Recent advances on support vector machines research, 2012.

[38] Philippe du Jardin, et al. A two-stage classification technique for bankruptcy prediction, 2016, Eur. J. Oper. Res.

[39] Trevor Hastie, et al. Computer Age Statistical Inference: Algorithms, Evidence, and Data Science, 2016.

[40] Iván Pastor Sanz, et al. Bankruptcy visualization and prediction using neural networks: A study of U.S. commercial banks, 2015, Expert Syst. Appl.

[41] Robert Carton, et al. Measuring organizational performance, 2006.

[42] Ting-Wen Chang, et al. Learning style Identifier: Improving the precision of learning style identification through computational intelligence algorithms, 2017, Expert Syst. Appl.

[43] Fabio Ciravegna, et al. Evaluating machine learning for information extraction, 2005, ICML.

[44] Leo Breiman, et al. Random Forests, 2001, Machine Learning.

[45] Leo Breiman, et al. Bagging Predictors, 1996, Machine Learning.

[46] Dae-Ki Kang, et al. Ensemble with neural networks for bankruptcy prediction, 2010, Expert Syst. Appl.

[47] Dae-Ki Kang, et al. Geometric mean based boosting algorithm with over-sampling to resolve data imbalance problem for bankruptcy prediction, 2015, Expert Syst. Appl.

[48] Funda Yurdakul, et al. Macroeconomic Modelling of Credit Risk for Banks, 2014.

[49] Deron Liang, et al. Financial ratios and corporate governance indicators in bankruptcy prediction: A comprehensive study, 2016, Eur. J. Oper. Res.

[50] Andreas Ziegler, et al. Consumer credit risk: Individual probability estimates using machine learning, 2013, Expert Syst. Appl.

[51] Kin Keung Lai, et al. Bankruptcy prediction using SVM models with a new approach to combine features selection and parameter optimisation, 2014, Int. J. Syst. Sci.

[52] David C. Yen, et al. A comparative study of classifier ensembles for bankruptcy prediction, 2014, Appl. Soft Comput.

[53] Ash Booth, et al. Automated trading with performance weighted random forests and seasonality, 2014, Expert Syst. Appl.

[54] Diana Bonfim. Credit Risk Drivers: Evaluating the Contribution of Firm Level Information and of Macroeconomic Dynamics, 2009.

[55] Yu Wang, et al. Ensemble classification based on supervised clustering for credit scoring, 2016, Appl. Soft Comput.

[56] Jian Ma, et al. A comparative assessment of ensemble learning for credit scoring, 2011, Expert Syst. Appl.

[57] Ammar Belatreche, et al. Evaluating machine learning classification for financial trading: An empirical approach, 2016, Expert Syst. Appl.

[58] Loris Nanni, et al. An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring, 2009, Expert Syst. Appl.

[59] Byeong Ho Kang, et al. Investigation and improvement of multi-layer perception neural networks for credit scoring, 2015, Expert Syst. Appl.

[60] S. Kim, et al. Predicting restaurant financial distress using decision tree and AdaBoosted decision tree models, 2014.

[61] Rommel M. Barbosa, et al. Comparative study of data mining techniques for the authentication of organic grape juice based on ICP-MS analysis, 2016, Expert Syst. Appl.

[62] Abdulhamit Subasi, et al. EEG signal classification using PCA, ICA, LDA and support vector machines, 2010, Expert Syst. Appl.

[63] William Stafford Noble, et al. Support vector machine, 2013.

[64] Marcelo Ângelo Cirillo, et al. Data classification with binary response through the Boosting algorithm and logistic regression, 2017, Expert Syst. Appl.

[65] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.