Mixture Composite Regression Models with Multi-type Feature Selection

The aim of this paper is to present a mixture composite regression model for claim severity modelling. Claim severity modelling poses several challenges, such as multimodality, heavy-tailedness and systematic effects in the data. We tackle this modelling problem by studying a mixture composite regression model that models attritional and large claims simultaneously and captures systematic effects in both the mixture components and the mixing probabilities. For model fitting, we present a group-fused regularization approach that allows us to select the explanatory variables that significantly impact the mixing probabilities and the different mixture components, respectively. We develop an asymptotic theory for this regularized estimation approach, and fitting is performed using a novel Generalized Expectation-Maximization algorithm. We exemplify our approach on a real motor insurance data set.
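To make the idea of systematic effects in both the mixing probabilities and the mixture components concrete, the following is a minimal sketch and not the model specified in the paper. It assumes a two-component mixture in which an exponential component with a log-link mean stands in for attritional claims, a Lomax (Pareto type II) component stands in for large claims, and the mixing probabilities follow a softmax of the covariates. All function names, parameter names and distribution choices here are hypothetical and chosen only for illustration.

```python
import numpy as np


def mixing_probs(X, alpha):
    """Covariate-dependent mixing probabilities via a softmax (assumed gating form)."""
    scores = X @ alpha                                   # shape (n, K)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def exp_pdf(y, mean):
    """Exponential density, used here as a stand-in for attritional claims."""
    return np.exp(-y / mean) / mean


def lomax_pdf(y, shape, scale):
    """Lomax (Pareto type II) density, used here as a stand-in for large claims."""
    return shape / scale * (1.0 + y / scale) ** (-(shape + 1.0))


def mixture_density(y, X, alpha, beta, tail_shape, tail_scale):
    """Density of the illustrative two-component mixture regression."""
    p = mixing_probs(X, alpha)   # systematic effects in the mixing probabilities
    mean = np.exp(X @ beta)      # systematic effects in the body (log link)
    body = exp_pdf(y, mean)
    tail = lomax_pdf(y, tail_shape, tail_scale)
    return p[:, 0] * body + p[:, 1] * tail


# Hypothetical design matrix: intercept plus one binary rating factor.
X = np.array([[1.0, 0.0], [1.0, 1.0]])
y = np.array([1.0, 1.0])
alpha = np.zeros((2, 2))  # zero gating coefficients give equal mixing weights
beta = np.zeros(2)        # zero regression coefficients give a body mean of 1
dens = mixture_density(y, X, alpha, beta, tail_shape=2.0, tail_scale=1.0)
```

In a sketch like this, a regularization penalty on `alpha` and `beta` would decide which covariates remain in the gating part and which remain in the component part. The group-fused penalty and the Generalized Expectation-Maximization steps themselves are not reproduced here.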
