Delta Boosting Machine with Application to General Insurance

ABSTRACT In this article, we introduce Delta Boosting (DB) as a new member of the boosting family. Like the popular Gradient Boosting (GB), DB is presented as a forward stagewise additive model that attempts to reduce the loss at each iteration by sequentially fitting a simple base learner to complement the running predictions. Instead of relying on the negative gradient, as is the case for GB, DB adopts a new measure called delta as the basis. Delta is defined as the loss minimizer at an observation level. We also show that DB is the optimal boosting member for a wide range of loss functions. The optimality is a consequence of DB solving for the split and the adjustment simultaneously to maximize loss reduction at each iteration. In addition, we introduce an asymptotic version of DB that works well for all twice-differentiable strictly convex loss functions. This asymptotic behavior does not depend on the number of observations but rather on a large number of iterations, which can be increased through common regularization techniques. We show that the basis in the asymptotic extension differs from the GB basis only by a multiple of the second derivative of the log-likelihood. This multiple acts as a correction factor that offsets GB's bias toward observations with high second derivatives. When the negative log-likelihood is used as the loss function, the correction can be interpreted as a credibility adjustment for the process variance. The simulation studies and real-data applications we conducted suggest that DB is a significant improvement over GB. The improvement from the asymptotic version is less dramatic but still compelling. Like GB, DB provides high transparency to users: the marginal influence of variables can be reviewed through relative importance charts and partial dependence plots. We can also assess overall model performance by evaluating losses, lifts, and double lifts on the holdout sample.
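To make the contrast between the bases concrete, the following is a minimal, illustrative Python sketch of a boosting step under a Poisson deviance loss with a log link, a setting common in general insurance. It computes the GB basis (the negative gradient, y - mu), the asymptotic DB basis ((y - mu)/mu, i.e., the negative gradient corrected by the second derivative), and the observation-level delta (log(y/mu)). The least-squares tree fit to delta and the leaf-level adjustment log(sum y / sum mu) are simplifying assumptions standing in for the paper's simultaneous split-and-adjustment search; all data, names, and constants here are hypothetical and not taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 5000
X = rng.uniform(size=(n, 3))
mu_true = np.exp(0.5 + X[:, 0] - 0.8 * X[:, 1])
y = rng.poisson(mu_true)                      # simulated claim counts

def poisson_nll(y, F):
    # Poisson negative log-likelihood with log link, up to a constant in y.
    return np.mean(np.exp(F) - y * F)

F = np.full(n, np.log(y.mean()))              # running score on the log scale
nu, eps = 0.1, 1e-6                           # shrinkage and a guard for zero counts
print("initial loss:", round(poisson_nll(y, F), 4))

for m in range(50):
    mu = np.exp(F)

    gb_basis  = y - mu                        # GB basis: negative gradient (for comparison only)
    adb_basis = (y - mu) / mu                 # asymptotic DB basis: gradient / second derivative
    delta     = np.log(np.maximum(y, eps) / mu)   # observation-level loss minimizer

    # Partition on the delta basis with a shallow tree, then set each leaf's
    # adjustment to the within-leaf loss minimizer log(sum y / sum mu).
    # This surrogate stands in for the joint split-and-adjustment search.
    tree = DecisionTreeRegressor(max_depth=2).fit(X, delta)
    leaves = tree.apply(X)
    for leaf in np.unique(leaves):
        idx = leaves == leaf
        adj = np.log(max(y[idx].sum(), eps) / mu[idx].sum())
        F[idx] += nu * adj

print("final loss:  ", round(poisson_nll(y, F), 4))
```

Because each leaf adjustment is the exact within-leaf loss minimizer and the shrunken step moves only part of the way toward it, the training loss decreases at every iteration in this sketch; the Poisson zero-count case is handled here with a crude floor (eps), which is purely an assumption of this illustration.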
