Grabit: Gradient tree-boosted Tobit models for default prediction

This paper presents a novel approach to default prediction. A common problem in default prediction is that default data are scarce, since bankruptcies are usually rare events. We show how this class imbalance issue can be alleviated by combining additional data with a novel model, the so-called Grabit model, which is obtained by applying gradient tree boosting to the Tobit model. We use the Grabit model to predict defaults on loans made to Swiss small and medium-sized enterprises (SMEs) and obtain a large improvement in predictive performance compared to other state-of-the-art approaches. In contrast to the Tobit model, the Grabit model can account for general forms of non-linearities and interactions, is robust against outliers in the covariates, is invariant to monotonic transformations of the covariates, and does not lose predictive performance under multicollinearity.
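
To make the idea of boosting trees on a Tobit likelihood more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Tobit model with a latent Gaussian response, a fixed scale `sigma`, and known censoring thresholds `lower` and `upper`; it uses plain gradient boosting with single-split regression stumps on one covariate and a constant learning rate. All function names, parameters, and the toy data are hypothetical and chosen for illustration only.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_neg_gradient(y, f, sigma, lower, upper):
    """Negative gradient of the Tobit negative log-likelihood w.r.t. f."""
    if y <= lower:  # censored from below: pushes f downwards
        z = (lower - f) / sigma
        return -norm_pdf(z) / (sigma * max(norm_cdf(z), 1e-12))
    if y >= upper:  # censored from above: pushes f upwards
        z = (f - upper) / sigma
        return norm_pdf(z) / (sigma * max(norm_cdf(z), 1e-12))
    return (y - f) / sigma ** 2  # uncensored: scaled Gaussian residual


def fit_stump(x, r):
    """Least-squares regression stump (one split) on a single covariate."""
    best = None
    for s in sorted(set(x))[:-1]:  # candidate thresholds; both sides non-empty
        left = [ri for xi, ri in zip(x, r) if xi <= s]
        right = [ri for xi, ri in zip(x, r) if xi > s]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - ml) ** 2 for ri in left) + sum((ri - mr) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, s, ml, mr)
    if best is None:  # all x identical: fall back to a constant
        mean = sum(r) / len(r)
        return lambda xi: mean
    _, s, ml, mr = best
    return lambda xi: ml if xi <= s else mr


def fit_tobit_boosting(x, y, sigma=1.0, lower=0.0, upper=math.inf,
                       n_rounds=100, lr=0.1):
    """Gradient boosting of stumps on the Tobit loss (sigma held fixed)."""
    f = [0.0] * len(y)  # current latent-mean estimate per observation
    stumps = []
    for _ in range(n_rounds):
        r = [tobit_neg_gradient(yi, fi, sigma, lower, upper)
             for yi, fi in zip(y, f)]
        h = fit_stump(x, r)  # fit base learner to the negative gradient
        stumps.append(h)
        f = [fi + lr * h(xi) for xi, fi in zip(x, f)]
    return lambda xi: sum(lr * h(xi) for h in stumps)


# Toy data (assumption): latent response equals x, observed value censored at 0.
x = [i / 10 for i in range(-20, 21)]
y = [max(0.0, xi) for xi in x]
predict = fit_tobit_boosting(x, y)
```

In this sketch, censored observations contribute through the Gaussian distribution function rather than through a squared residual, so the fitted latent mean for `x < 0` is allowed to fall below the censoring point instead of being pulled towards it. A practical implementation would use multivariate trees, estimate or tune `sigma`, and choose the number of boosting rounds by validation.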
