The limitations of distribution sampling for linkage learning

This paper investigates the performance of estimation of distribution algorithms (EDAs) on binary test problems containing parity functions. We describe two test problems: the concatenated parity function (CPF) and the concatenated parity/trap function (CP/TF). Although these functions are separable, with bounded complexity and uniformly scaled sub-function contributions, the hierarchical Bayesian optimization algorithm (hBOA) scales exponentially on both. hBOA can solve large CPFs with small population sizes, yet fails to solve them with larger ones. We argue that test problems containing parity functions are hard for EDAs because no strict subset of a parity function's bits shows any interaction in its contribution to fitness. Consequently, as population size increases, the dependency between variable values within any strict subset of a parity function's bits decreases. Unfortunately, most EDAs, including hBOA, search for their models by looking (at least at first) for dependencies between pairs of variables. We make suggestions on how EDAs could be adjusted to handle parity problems, but also comment on the apparently inevitable computational cost.
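The argument about strict subsets can be illustrated with a minimal sketch. It assumes a CPF built as the sum of parity scores over consecutive k-bit blocks, with odd parity scoring 1; the abstract does not give the exact definition, so this form, the block size `k = 4`, and the names `parity`, `cpf`, and `mutual_information` are illustrative assumptions. Enumerating every string of one block and keeping only the optimal ones stands in for perfect selection on an unboundedly large population.

```python
from itertools import product, combinations
from collections import Counter
from math import log2

def parity(bits):
    """Return 1 if the block holds an odd number of ones, else 0 (assumed scoring)."""
    return sum(bits) % 2

def cpf(bits, k):
    """Assumed CPF form: sum of parity scores over consecutive k-bit blocks."""
    return sum(parity(bits[i:i + k]) for i in range(0, len(bits), k))

def mutual_information(samples, i, j):
    """Mutual information (in bits) between positions i and j of the samples."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    p_i = Counter(s[i] for s in samples)
    p_j = Counter(s[j] for s in samples)
    return sum((c / n) * log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
               for (a, b), c in joint.items())

k = 4  # hypothetical block size
block = list(product((0, 1), repeat=k))

# Idealized selection: keep every string of one block that scores the maximum.
selected = [s for s in block if parity(s) == 1]

# Pairwise statistics, the kind of signal a pairwise model search starts from.
pairwise = [mutual_information(selected, i, j)
            for i, j in combinations(range(k), 2)]

# The full block is nevertheless completely dependent: the first k-1 bits
# determine the last one, so every prefix occurs exactly once.
last_bit_determined = len({s[:-1] for s in selected}) == len(selected)

print("max pairwise mutual information:", max(pairwise))
print("last bit determined by the others:", last_bit_determined)
```

Under these assumptions every pair of bits in the selected set is statistically independent, even though the block as a whole is fully constrained. A model search that begins with pairwise dependencies therefore has nothing to detect except sampling noise, which shrinks as the population grows.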
