From linear to non-linear kernel based classifiers for bankruptcy prediction

Bankruptcy prediction has been a topic of research for decades, in both the financial and the academic world. The implementation of international financial and accounting standards such as Basel II and IFRS, together with the recent credit crisis, has accentuated this topic even further. This paper describes both regularized and non-linear kernel variants of traditional discriminant analysis techniques, such as logistic regression, Fisher discriminant analysis (FDA) and quadratic discriminant analysis (QDA). In addition to a systematic description of these variants, we contribute to the literature by introducing kernel QDA and by providing a comprehensive benchmarking study of these classification techniques and their regularized and kernel versions for bankruptcy prediction, using 10 real-life data sets. Performance is compared in terms of binary classification accuracy, relevant for evaluating yes/no credit decisions, and in terms of classification accuracy, relevant for pricing differentiated credit granting. The results clearly indicate a significant improvement for the kernel variants in both percentage correctly classified (PCC) test instances and area under the ROC curve (AUC), and indicate that bankruptcy problems are weakly non-linear. On average, the best performance is achieved by the least squares support vector machine (LSSVM), closely followed by kernel quadratic discriminant analysis. Given the high impact of small improvements in performance, we show the relevance and importance of considering kernel techniques within this setting. Additional experiments with backward input selection improve our results further. Finally, we experimentally investigate the relative ranking of the different categories of variables (liquidity, solvency, profitability and various), and as such provide new insights into the relative importance of these categories for predicting financial distress.
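The two performance measures named in the abstract, PCC and AUC, can be illustrated with a minimal sketch. It is not the evaluation code used in the study: the function names `pcc` and `auc`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUC is computed here in its pairwise (Mann-Whitney) form.

```python
def pcc(y_true, y_pred):
    """Percentage correctly classified: share of labels predicted exactly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve in its pairwise (Mann-Whitney) form.

    Over all (positive, negative) pairs, count how often the positive case
    receives the higher score; ties count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = bankrupt, 0 = healthy; the scores are made up.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]

# Assumed threshold of 0.5 turns scores into yes/no decisions for PCC.
preds = [1 if s >= 0.5 else 0 for s in scores]

print(f"PCC = {pcc(labels, preds):.1f}%")
print(f"AUC = {auc(labels, scores):.3f}")
```

PCC depends on the chosen threshold, whereas AUC depends only on how the scores rank the cases.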
