Hurdle models of loan default

Some models of loan default are binary, modelling only the probability of default, while others go further and model the extent of default (e.g. number of outstanding payments or amount in arrears). The double-hurdle model, originally due to Cragg (Econometrica, 1971) and conventionally applied to household consumption or labour supply decisions, contains two equations: one determines whether or not a customer is a potential defaulter (the ‘first hurdle’), and the other determines the extent of default. In separating these two processes, the model recognizes that there exists a subset of the observed non-defaulters who would never default whatever their circumstances. A Box-Cox transformation applied to the dependent variable is a useful generalization of the model. Estimation is relatively easy using the maximum likelihood routine available in Stata. The model is applied to a sample of 2515 loan applicants whose loans were approved, a sizeable proportion of whom defaulted to varying degrees. The dependent variables used are amount in arrears and number of days in arrears. The value of the hurdle approach is confirmed by the finding that certain key explanatory variables have very different effects in the two equations. Most notably, the effect of loan amount is strongly positive on arrears but U-shaped on the probability of default. The former effect is seriously underestimated when the first hurdle is ignored.
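
To make the two-equation structure concrete, the following is a minimal sketch of a double-hurdle log-likelihood in Python. It is not the Stata routine referred to above. It assumes independent normal errors in the two equations and omits the Box-Cox transformation, and the function and argument names (double_hurdle_loglik, alpha, beta, sigma, X, Z) are hypothetical, chosen only for illustration. A zero observation can arise from failing either hurdle, while a positive observation must have passed both.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def double_hurdle_loglik(alpha, beta, sigma, y, X, Z):
    """Log-likelihood of a double-hurdle model with independent normal errors.

    Assumed (illustrative) setup, with no Box-Cox transformation:
      alpha : coefficients of the first-hurdle equation (potential defaulter or not)
      beta  : coefficients of the second equation (extent of default)
      sigma : standard deviation of the second-equation error, must be > 0
      y     : observed extent of default, zero for non-defaulters
      X, Z  : rows of regressors for the second and first equations respectively
    """
    total = 0.0
    for y_i, x_i, z_i in zip(y, X, Z):
        # Probability of clearing the first hurdle.
        p_first = _STD_NORMAL.cdf(sum(a * z for a, z in zip(alpha, z_i)))
        # Linear index of the extent equation.
        xb = sum(b * x for b, x in zip(beta, x_i))
        if y_i > 0:
            # Positive outcome: first hurdle cleared, and the latent extent
            # equals the observed value (normal density scaled by sigma).
            density = _STD_NORMAL.pdf((y_i - xb) / sigma) / sigma
            total += math.log(p_first) + math.log(density)
        else:
            # Zero outcome: at least one of the two hurdles was not cleared.
            p_second = _STD_NORMAL.cdf(xb / sigma)
            total += math.log(1.0 - p_first * p_second)
    return total
```

Setting the first-hurdle probability to one for every customer collapses this expression to the standard Tobit likelihood, which is the sense in which ignoring the first hurdle is a restriction. In practice the function would be passed, with its sign reversed, to a numerical optimizer, and a more careful version would work with log-densities throughout to avoid underflow.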