GADF — Genetic Algorithms for distribution fitting

Distribution fitting is a recurring problem in fields such as telecommunications, finance, economics, sociology, and physics. Standard methods often require solving difficult systems of equations or investing in specialized software. This paper presents a new approach to distribution fitting that uses Genetic Algorithms to simultaneously identify the distribution type and tune its parameters, given a dataset sampled from the observed random variable and a set of candidate distribution families. The strength of the approach lies in how easily this set can be expanded: each distribution is described simply by its probability density function and cumulative distribution function, which are well known. The approach employs two score metrics, the Mean Absolute Error and the Kolmogorov-Smirnov test, which are linearly combined to find the best-fitting distribution. Results obtained in an industrial application are presented and discussed.
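The idea of searching over distribution type and parameters at once, scored by a linear combination of Mean Absolute Error and the Kolmogorov-Smirnov distance, can be illustrated with a minimal sketch. This is not the paper's implementation. Everything specific here is an assumption made for illustration: the two candidate families, the function names (`score`, `mutate`, `fit`), the equal weighting, the use of the empirical CDF for both metrics, and the simple elitist, mutation-only evolutionary loop.

```python
import math
import random

# Hypothetical candidate families: name -> (CDF, number of parameters).
# Adding a family only requires supplying its CDF here.
def normal_cdf(x, p):
    mu, sigma = p[0], abs(p[1]) + 1e-9  # keep sigma strictly positive
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def exponential_cdf(x, p):
    rate = abs(p[0]) + 1e-9  # keep rate strictly positive
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

FAMILIES = {"normal": (normal_cdf, 2), "exponential": (exponential_cdf, 1)}

def score(individual, data_sorted, weight=0.5):
    """Weighted sum of MAE and KS distance between empirical and model CDF.

    Lower is better. The weight of 0.5 is an assumption, not a value
    taken from the paper.
    """
    name, params = individual
    cdf = FAMILIES[name][0]
    n = len(data_sorted)
    abs_err, ks = 0.0, 0.0
    for i, x in enumerate(data_sorted):
        f = cdf(x, params)
        abs_err += abs((i + 0.5) / n - f)                   # MAE term
        ks = max(ks, abs((i + 1) / n - f), abs(i / n - f))  # KS statistic
    return weight * abs_err / n + (1.0 - weight) * ks

def random_individual(rng, scale):
    # An individual encodes both the family and its parameter vector.
    name = rng.choice(list(FAMILIES))
    return (name, [rng.uniform(0.1, scale) for _ in range(FAMILIES[name][1])])

def mutate(individual, rng, scale, switch_prob=0.1):
    # Occasionally jump to a fresh random family; otherwise perturb parameters.
    if rng.random() < switch_prob:
        return random_individual(rng, scale)
    name, params = individual
    return (name, [p + rng.gauss(0.0, 0.1 * scale) for p in params])

def fit(data, generations=60, pop_size=40, seed=0):
    rng = random.Random(seed)
    data_sorted = sorted(data)
    scale = max(abs(x) for x in data_sorted) or 1.0
    pop = [random_individual(rng, scale) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: score(ind, data_sorted))
        elite = pop[: pop_size // 4]  # keep the best quarter unchanged
        children = [mutate(rng.choice(elite), rng, scale)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    best = min(pop, key=lambda ind: score(ind, data_sorted))
    return best, score(best, data_sorted)
```

Calling `fit(samples)` on a list of floats returns the best `(family_name, parameters)` pair together with its combined score. Because the family name is part of each individual, selection compares candidates from different families on the same score, which is what allows type identification and parameter tuning to happen in a single search.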
