Bankruptcy prediction of small and medium enterprises using a flexible binary generalized extreme value model

We introduce an accounting-based binary regression model for bankruptcy prediction of small and medium enterprises (SMEs). The main advantage of the model lies in its predictive performance in identifying defaulted SMEs. Another advantage, which is especially relevant for banks, is that the functional form of the relationship between the accounting characteristics of SMEs and the response is not assumed a priori (e.g., linear, quadratic or cubic) and can instead be determined from the data. The proposed approach uses the quantile function of the generalized extreme value (GEV) distribution as the link function, together with smooth functions of the accounting characteristics, to model covariate effects flexibly. The usual scoring-model assumptions of a symmetric link function and of linear or pre-specified covariate-response relationships are therefore relaxed. Out-of-sample and out-of-time validation on Italian data shows that our proposal outperforms the commonly used logistic scoring model for different default horizons.
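
To illustrate what an asymmetric GEV link means in practice, here is a minimal sketch rather than the authors' implementation. It assumes the standard GEV parametrization with location 0 and scale 1, and the shape symbol `tau` and all function names are hypothetical choices made for this example. The link is the GEV quantile function, so the default probability is the GEV distribution function evaluated at the predictor `eta`. The smooth covariate functions mentioned above are not modelled here; `eta` is simply taken as given.

```python
import math


def gev_response(eta, tau):
    """GEV distribution function: maps a predictor eta to a probability.

    tau is the shape parameter (assumed name); tau -> 0 gives the Gumbel case.
    """
    if abs(tau) < 1e-12:
        return math.exp(-math.exp(-eta))
    base = 1.0 + tau * eta
    if base <= 0.0:
        # Outside the support: the probability sits at the boundary.
        return 0.0 if tau > 0 else 1.0
    return math.exp(-(base ** (-1.0 / tau)))


def gev_link(p, tau):
    """GEV quantile function: maps a probability in (0, 1) to the predictor."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if abs(tau) < 1e-12:
        return -math.log(-math.log(p))
    return ((-math.log(p)) ** (-tau) - 1.0) / tau


def logistic_response(eta):
    """Logistic response, the symmetric benchmark used in standard scoring."""
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    tau = -0.25  # illustrative value only, not an estimate from the paper
    for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(eta, gev_response(eta, tau), logistic_response(eta))
```

Under the logistic response the probabilities at `eta` and `-eta` always sum to one, whereas the GEV response approaches 0 and 1 at different rates. That asymmetry is the property the abstract relies on when defaults are rare.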
