A global optimisation approach for parameter estimation of a mixture of double Pareto lognormal and lognormal distributions

The double Pareto lognormal (dPlN) distribution, defined in terms of an exponentiated skewed Laplace distribution and a lognormal distribution, has proven suitable for fitting heavy-tailed data. In this work we investigate inference for a mixture of one dPlN component and (k - 1) lognormal components, with k fixed: a model for extreme and skewed data that additionally captures multimodality. We take maximum likelihood as the estimation criterion, which yields a global optimisation problem whose objective function is difficult to evaluate and to optimise. Variable Neighbourhood Search (VNS) proves to be a powerful tool for overcoming these difficulties. Our approach is illustrated with both simulated and real data, on which our VNS and a standard multistart are compared. The computational experience shows that the VNS is numerically more stable and provides slightly better objective values.
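
To make the optimisation problem concrete, the following Python sketch evaluates the negative log-likelihood of a mixture of one dPlN component and (k - 1) lognormal components, and minimises a generic objective with a basic VNS loop. It is an illustration under stated assumptions, not the implementation used in this work: the dPlN density is written in the closed form usually attributed to Reed and Jorgensen, with parameters (alpha, beta, nu, tau), and the function names, the neighbourhood radii and the random-perturbation local search are all hypothetical choices made for brevity.

```python
import math
import random

def norm_cdf(z):
    # Standard normal CDF via the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def lognormal_pdf(x, mu, sigma):
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def dpln_pdf(x, alpha, beta, nu, tau):
    # Assumed closed form: log X = nu + tau*Z + E1/alpha - E2/beta,
    # with Z standard normal and E1, E2 standard exponentials.
    lx = math.log(x)
    a_up = math.exp(alpha * nu + 0.5 * alpha ** 2 * tau ** 2)
    a_lo = math.exp(-beta * nu + 0.5 * beta ** 2 * tau ** 2)
    upper = a_up * x ** (-alpha - 1.0) * norm_cdf((lx - nu - alpha * tau ** 2) / tau)
    lower = a_lo * x ** (beta - 1.0) * norm_cdf(-(lx - nu + beta * tau ** 2) / tau)
    return alpha * beta / (alpha + beta) * (upper + lower)

def mixture_nll(data, weights, dpln_params, ln_params):
    # weights[0] belongs to the dPlN component, weights[1:] to the lognormals.
    total = 0.0
    for x in data:
        dens = weights[0] * dpln_pdf(x, *dpln_params)
        for w, (mu, sigma) in zip(weights[1:], ln_params):
            dens += w * lognormal_pdf(x, mu, sigma)
        total -= math.log(max(dens, 1e-300))  # guard against log(0)
    return total

def local_search(f, x, step, rng, iters=200):
    # Hypothetical local search: accept Gaussian perturbations that improve f.
    best, fbest = list(x), f(x)
    for _ in range(iters):
        cand = [xi + rng.gauss(0.0, step) for xi in best]
        fc = f(cand)
        if fc < fbest:
            best, fbest = cand, fc
    return best, fbest

def vns(f, x0, radii, rng, max_rounds=30):
    # Basic VNS: shake within the k-th neighbourhood, run the local search,
    # restart from the first neighbourhood on improvement, else widen.
    step = radii[0] / 10.0
    best, fbest = local_search(f, x0, step, rng)
    for _ in range(max_rounds):
        k = 0
        while k < len(radii):
            shaken = [xi + rng.uniform(-radii[k], radii[k]) for xi in best]
            cand, fc = local_search(f, shaken, step, rng)
            if fc < fbest:
                best, fbest = cand, fc
                k = 0
            else:
                k += 1
    return best, fbest
```

To fit the mixture with this sketch, one would wrap `mixture_nll` in an objective that maps a flat parameter vector to (weights, dPlN parameters, lognormal parameters) and returns `float("inf")` for infeasible vectors, such as non-positive scale parameters or weights that do not sum to one, and then pass that objective to `vns`.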
