Limitations of benchmark sets and landscape features for algorithm selection and performance prediction

Benchmark sets and landscape features are used to test algorithms and to train models that perform algorithm selection or configuration. These approaches rest on the assumption that algorithms perform similarly on problems with similar feature sets. In this paper, we test different configurations of differential evolution (DE) on the BBOB benchmark set. We then use the landscape features of those problems and a case-based reasoning approach for DE configuration selection. We show that, although this method obtains good results on BBOB problems, it fails to select the best configurations when faced with a new set of optimisation problems with a distinct array of landscape features. This demonstrates the limitations of the BBOB set for algorithm selection. Moreover, by examining the relationship between features and algorithm performance, we show that there is no correlation between the feature space and the performance space. We conclude by identifying some important open questions raised by this work.
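
The configuration-selection step described above can be illustrated with a minimal sketch: nearest-neighbour retrieval in landscape-feature space, the simplest form of case-based reasoning. All feature values, DE configuration strings, and the function name select_de_config below are hypothetical placeholders for illustration, not data or code from the paper.

import numpy as np

# Hypothetical case base: landscape feature vectors for known benchmark
# problems, paired with the best-performing DE configuration for each.
case_features = np.array([
    [0.12, 0.80, 1.5],
    [0.45, 0.30, 2.1],
    [0.90, 0.55, 0.7],
])
best_configs = [
    "DE/rand/1/bin, F=0.5, CR=0.9",
    "DE/best/1/bin, F=0.8, CR=0.5",
    "DE/current-to-best/1, F=0.6, CR=0.7",
]

def select_de_config(new_features):
    # Retrieve the stored case closest to the new problem in feature
    # space (Euclidean distance) and reuse its best-known configuration.
    distances = np.linalg.norm(case_features - new_features, axis=1)
    return best_configs[int(np.argmin(distances))]

print(select_de_config(np.array([0.40, 0.35, 2.0])))
# Nearest case is the second one, so its configuration is returned.

Under this scheme, a query problem unlike any stored case still retrieves some configuration; if proximity in feature space does not imply proximity in performance space, as the paper argues, the retrieved configuration may be far from the best one.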