Multivariate Markov networks for fitness modelling in an estimation of distribution algorithm

A well-known paradigm for optimisation is the evolutionary algorithm (EA). An EA maintains a population of candidate solutions to a problem and drives it towards a global optimum using biologically inspired selection and reproduction operators. These algorithms have been shown to perform well on a variety of hard optimisation and search problems. A recent development in evolutionary computation is the Estimation of Distribution Algorithm (EDA), which replaces the traditional genetic reproduction operators (crossover and mutation) with the construction and sampling of a probabilistic model. While building the model can often represent a significant computational expense, the benefit is that the model contains explicit information about the fitness function. This thesis expands on recent work using a Markov network to model fitness in an EDA, resulting in what we call the Markov Fitness Model (MFM). The work has explored the theoretical foundations of the MFM approach, which are grounded in Walsh analysis of fitness functions. This has allowed us to demonstrate a clear relationship between the fitness model and the underlying dynamics of the problem. A key achievement is that we have been able to show how the model can be used to predict fitness, and we have devised a measure of fitness modelling capability called the fitness prediction correlation (FPC). We have performed a series of experiments which use the FPC to investigate the effect of population size and selection operator on the fitness modelling capability. The results and analysis of these experiments are an important addition to other work on diversity and fitness distribution within populations. With this improved understanding of fitness modelling we have been able to extend the Distribution Estimation Using Markov networks (DEUM) framework to use a multivariate probabilistic model.
We have proposed a number of algorithms based on this framework which exploit the MFM for optimisation, and demonstrated their performance; these can now be added to the EA toolbox. As part of this we have investigated existing techniques for learning the structure of the MFM; a further contribution resulting from this is the introduction of precision and recall as measures of structure quality. We have also proposed a number of possible directions that future work could take.
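To make the idea of a fitness prediction correlation concrete, the following is a minimal sketch and not the thesis's implementation. It rests on several assumptions that the abstract does not state: the FPC is taken here to be the Spearman rank correlation between true and model-predicted fitness on previously unseen solutions; the model is a simple univariate one, fitted by least squares to -ln f(x) over bits recoded as -1/+1; and the fitness function is a toy OneMax variant. All function names are hypothetical.

```python
import numpy as np


def features(pop):
    """Univariate model terms: bits {0,1} mapped to spins {-1,+1}, plus a constant."""
    spins = 2.0 * np.asarray(pop, dtype=float) - 1.0
    return np.hstack([spins, np.ones((spins.shape[0], 1))])


def fit_model(pop, fitness):
    """Least-squares coefficients alpha with features(pop) @ alpha ~= -ln(fitness).

    The -ln transform is an assumption of this sketch.
    """
    alpha, *_ = np.linalg.lstsq(features(pop), -np.log(fitness), rcond=None)
    return alpha


def predict_fitness(alpha, pop):
    """Invert the log transform to obtain predicted fitness values."""
    return np.exp(-(features(pop) @ alpha))


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(values.size)
    ranks[order] = np.arange(1, values.size + 1)
    _, inverse = np.unique(values, return_inverse=True)
    mean_rank = np.bincount(inverse, weights=ranks) / np.bincount(inverse)
    return mean_rank[inverse]


def fitness_prediction_correlation(true_fitness, predicted_fitness):
    """Spearman rank correlation (assumed definition of FPC in this sketch)."""
    r_true = average_ranks(true_fitness)
    r_pred = average_ranks(predicted_fitness)
    return float(np.corrcoef(r_true, r_pred)[0, 1])


def toy_fitness(pop):
    """OneMax shifted by 1 so that the logarithm is always defined."""
    return 1.0 + np.asarray(pop).sum(axis=1)


def demo(n_bits=10, n_train=200, n_test=200, seed=0):
    """Fit on one random population, then score predictions on a fresh one."""
    rng = np.random.default_rng(seed)
    train = rng.integers(0, 2, size=(n_train, n_bits))
    test = rng.integers(0, 2, size=(n_test, n_bits))
    alpha = fit_model(train, toy_fitness(train))
    return fitness_prediction_correlation(
        toy_fitness(test), predict_fitness(alpha, test)
    )


if __name__ == "__main__":
    print(f"FPC on unseen solutions: {demo():.3f}")
```

A value near 1 would indicate that the model ranks unseen solutions in much the same order as the true fitness function. Varying `n_train` in `demo` gives a rough feel for how population size could affect modelling capability in this toy setting.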
