NK Landscape Instances Mimicking the Protein Inverse Folding Problem Towards Future Benchmarks

This paper introduces two new nominal NK Landscape model instances designed to mimic the properties of a challenging optimisation problem from biology: the Inverse Folding Problem (IFP), here restricted to a simpler secondary-structure version. Landscape analysis tests identify numerous problem properties, which are used to parameterise and validate the model instances in terms of epistatic links and adaptive-walk and random-walk characteristics. The performance of different Genetic Algorithms (GAs) is then compared on both the new NK models and the original IFP, in terms of population diversity, solution quality and convergence characteristics. In all presented tests the NK models capture properties very similar to those of the real IFP, while being significantly faster to evaluate. The intended future purpose of such a model is to provide a generic benchmark for algorithms targeting protein sequence optimisation, specifically in protein design. It may also provide the foundation for more in-depth studies of the size, shape and characteristics of the space of good solutions to the IFP.
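For readers unfamiliar with the model, the following is a minimal sketch of a generic NK landscape and a steepest-ascent adaptive walk on it. It is not the parameterisation used in this paper: the cyclic adjacent-neighbour epistatic links, the uniformly random contribution tables, and the names `make_nk_landscape` and `adaptive_walk` are all assumptions made for illustration.

```python
import random


def make_nk_landscape(n, k, seed=0):
    """Return a fitness function for a random NK landscape (illustrative only).

    Assumption: locus i is epistatically linked to its k cyclic right-hand
    neighbours. Contribution values are drawn uniformly from [0, 1) and
    cached on first use, so the tables work for any alphabet size.
    """
    rng = random.Random(seed)
    links = [[(i + j) % n for j in range(k + 1)] for i in range(n)]
    tables = [dict() for _ in range(n)]

    def fitness(seq):
        total = 0.0
        for i in range(n):
            key = tuple(seq[j] for j in links[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()
            total += tables[i][key]
        return total / n  # mean of the n locus contributions

    return fitness


def adaptive_walk(fitness, n, alphabet_size=2, seed=1):
    """Steepest-ascent adaptive walk from a random start to a local optimum.

    Returns (sequence, fitness, number_of_improving_steps).
    """
    rng = random.Random(seed)
    current = [rng.randrange(alphabet_size) for _ in range(n)]
    current_fit = fitness(current)
    steps = 0
    while True:
        best, best_fit = None, current_fit
        # Examine every single-locus substitution of the current sequence.
        for i in range(n):
            for a in range(alphabet_size):
                if a == current[i]:
                    continue
                cand = current[:i] + [a] + current[i + 1:]
                f = fitness(cand)
                if f > best_fit:
                    best, best_fit = cand, f
        if best is None:  # no improving neighbour: local optimum reached
            return current, current_fit, steps
        current, current_fit = best, best_fit
        steps += 1


if __name__ == "__main__":
    f = make_nk_landscape(n=20, k=3, seed=0)
    seq, fit, steps = adaptive_walk(f, n=20, alphabet_size=4, seed=1)
    print(seq, round(fit, 3), steps)
```

The length of such walks, averaged over many random starting points, is one of the characteristics commonly used to compare the ruggedness of a model landscape with that of a target problem; a larger `alphabet_size` (for example 20, for amino acids) could be passed to `adaptive_walk` under the same assumptions.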