Mixture modeling of data with multiple partial right-censoring levels

In this paper, a new flexible approach to modeling data with multiple partial right-censoring points is proposed. The method is based on finite mixture models, a flexible tool for modeling heterogeneity in data. A general framework that accommodates partial censoring is considered: at each censoring level, it is assumed that a certain portion of the data points are censored and the rest are not. This situation occurs in many insurance loss data sets. A novel probability function is proposed for use as a mixture component, and the expectation-maximization algorithm is employed to estimate the model parameters. The Bayesian information criterion is used for model selection. Additionally, an approach is considered for assessing the variability of parameter estimates and for computing quantiles, commonly used as risk measures. The proposed model is evaluated in a simulation study based on four probability distributions commonly used to model right-skewed loss data, and it is then applied to a real data set with good results.
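To make the partial-censoring setting concrete, the following is a minimal sketch of how a right-censored mixture log-likelihood and the Bayesian information criterion could be computed. It is not the probability function proposed in the paper, which is not given here. It assumes exponential mixture components and a per-observation censoring flag, and the names `censored_mixture_loglik` and `bic` are hypothetical.

```python
import math


def censored_mixture_loglik(values, censored, weights, rates):
    """Log-likelihood of a finite exponential mixture under right censoring.

    Assumption for illustration only: exponential components, not the
    component proposed in the paper.
    values[i]   -- observed loss, or the censoring level if censored[i] is True
    censored[i] -- True when it is only known that the loss is at least values[i]
    """
    total = 0.0
    for x, is_censored in zip(values, censored):
        if is_censored:
            # A censored point contributes the mixture survival function.
            contrib = sum(w * math.exp(-r * x) for w, r in zip(weights, rates))
        else:
            # An exactly observed point contributes the mixture density.
            contrib = sum(w * r * math.exp(-r * x) for w, r in zip(weights, rates))
        total += math.log(contrib)
    return total


def bic(loglik, n_params, n_obs):
    """Schwarz criterion in the 'smaller is better' form."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Two observations sit at the level 5.0, but only one of them is censored,
# which is the "partial censoring" situation described above.
values = [1.0, 2.0, 5.0, 5.0]
censored = [False, False, True, False]
loglik = censored_mixture_loglik(values, censored, weights=[1.0], rates=[1.0])
score = bic(loglik, n_params=1, n_obs=len(values))
```

In practice the log-likelihood would be maximized, for example with the EM algorithm, for each candidate number of components, and the fit with the smallest BIC would be retained.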
