Negative log-likelihood and statistical hypothesis testing as the basis of model selection in IDEAs

In this paper, we analyze the most prominent features in model selection criteria that have been used so far in iterated density estimation evolutionary algorithms (IDEAs, EDAs, PMBGAs). These algorithms build probabilistic models and estimate probability densities based upon a selection of the available points. We show that the negative log-likelihood is the basis of the inferred features when the Kullback-Leibler divergence is used. We then show how issues previously found to be problematic in the case of continuous random variables can be resolved by starting from these derived basics. The result is a probabilistic model search metric that can be justified through the use of statistical hypothesis tests, which in turn reduces the need for additional complexity penalties.
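As a rough illustration of the idea sketched above (not the paper's own implementation), the following Python snippet compares two candidate probabilistic models of continuous data by their negative log-likelihood and decides between them with a standard likelihood-ratio hypothesis test instead of an ad-hoc complexity penalty. The bivariate Gaussian setting, the function names, and the significance level are illustrative assumptions.

```python
import math
import random

def neg_log_likelihood_full(data):
    # NLL of an MLE-fit bivariate Gaussian with full covariance.
    # For an MLE fit, NLL = (n/2) * (d*log(2*pi) + log(det(Sigma)) + d), d = 2.
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / n
    syy = sum((y - my) ** 2 for _, y in data) / n
    sxy = sum((x - mx) * (y - my) for x, y in data) / n
    det = sxx * syy - sxy ** 2
    return 0.5 * n * (2 * math.log(2 * math.pi) + math.log(det) + 2)

def neg_log_likelihood_diag(data):
    # Restricted model: the two variables are assumed independent,
    # i.e. the product of two univariate Gaussians (diagonal covariance).
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / n
    syy = sum((y - my) ** 2 for _, y in data) / n
    return 0.5 * n * (2 * math.log(2 * math.pi) + math.log(sxx * syy) + 2)

def lr_test_dependence(data, alpha=0.05):
    # Likelihood-ratio statistic: 2 * (NLL_restricted - NLL_full).
    # The full model has one extra parameter (the covariance term),
    # so the statistic is asymptotically chi-squared with 1 degree of freedom.
    stat = 2.0 * (neg_log_likelihood_diag(data) - neg_log_likelihood_full(data))
    # chi^2(1) upper-tail probability via the complementary error function.
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value, p_value < alpha  # True -> keep the dependency

# Strongly correlated sample: the test should reject independence.
random.seed(0)
corr = [(x, x + random.gauss(0.0, 0.3))
        for x in (random.gauss(0.0, 1.0) for _ in range(200))]
stat, p, keep = lr_test_dependence(corr)
```

Because the diagonal model is nested in the full one, its negative log-likelihood can never be lower, so the statistic is non-negative; the hypothesis test then supplies the significance threshold that would otherwise have to come from a complexity penalty.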
