Selecting a diverse set of benchmark instances from a tunable model problem for black-box discrete optimization algorithms

Abstract

As the number of practical applications of discrete black-box metaheuristics grows rapidly, benchmarking these algorithms is gaining importance. While new algorithms are often introduced for specific problem domains, researchers are also interested in which general problem characteristics are hard for which type of algorithm. The W-Model is a benchmark function for discrete black-box optimization that allows for the easy, fast, and reproducible generation of problem instances exhibiting characteristics such as ruggedness, deceptiveness, epistasis, and neutrality in a tunable way. We conduct the first large-scale study with the W-Model in its fixed-length single-objective form, investigating 17 algorithm configurations (including Evolutionary Algorithms and local searches) and 8372 problem instances. We develop and apply a machine learning methodology to automatically discover several clusters of optimization process runtime behaviors and to explain them in terms of the algorithm and model parameters. Both a detailed statistical evaluation and our methodology confirm that the different model parameters allow us to generate problem instances of different hardness; they also show that the investigated algorithms struggle with different problem characteristics. With our methodology, we select a set of 19 diverse problem instances with which researchers can conduct a fast but still in-depth analysis of algorithm performance. The best-performing algorithms in our experiment were Evolutionary Algorithms applying Frequency Fitness Assignment, which proved robust over a wide range of problem settings and solved more instances than the other tested algorithms.
