Walsh analysis, epistasis, and optimization problem difficulty for evolutionary algorithms

The epistasis of a function is the bitwise nonlinearity of a function whose domain is the set of bit strings of length L. Epistasis is related to problem difficulty for evolutionary algorithms, and Walsh analysis can be used to quantify it. In this dissertation Walsh analysis is developed in detail, starting with the definitions for the Walsh transform. The epistasis of a polynomial is shown to be bounded by the degree of the polynomial, and the epistasis of a logical expression by the number of variables involved. The effects on epistasis of problem representation operators such as bit extraction, scaling, translation, and Gray coding are also studied. The epistasis of functions composed of subfunctions combined by various operators is predicted. I show that functions can display odd and even parity and prove several theorems on the invariance of these properties. An application of these theorems demonstrates that picking the proper problem representation reduces epistasis, often making the problem easier to solve. Several new measures of epistasis are created, including Walsh sums, Walsh counts, function order, and coverage. New and useful mathematical tools for Walsh analysis are presented, such as the pack, unpack, and spectral functions, and hyperplane numbering. The concept of embedded landscapes is introduced. An embedded landscape is a sum of a set of subfunctions, each of which takes as its argument a subset of bits from the domain of the landscape. Embedded landscapes with bounded subfunction domains are shown to have epistasis limited to that of the subfunctions. Both MAXkSAT and NK-landscapes are shown to be embedded landscapes. I prove that all the Walsh coefficients of an embedded landscape, composed of subfunctions of bounded domain, can be computed in polynomial time. Summary statistics (variance, skew, etc.) can also be computed in polynomial time.
I conclude that even though all Walsh coefficients of a function can be known in polynomial time, optimizing the function can still be NP-complete.
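
The claim about embedded landscapes can be illustrated with a small sketch. It is not the dissertation's own algorithm; it relies only on the linearity of the Walsh transform and on several assumptions: the normalization w_j = 2^(-L) * sum over x of f(x) * (-1)^popcount(j AND x), subfunctions given as lookup tables, and hypothetical helpers named pack and unpack whose definitions here may differ from the tools of the same name mentioned above. Each subfunction over k bits is transformed on its own (2^k work), and its coefficients are then placed at the matching positions of the full landscape.

```python
def popcount(x):
    """Number of 1 bits in a non-negative integer."""
    return bin(x).count("1")

def walsh_coefficients(values):
    """Brute-force Walsh transform of a table of 2^k function values.
    Assumed normalization: w_j = (1/2^k) * sum_x f(x) * (-1)^popcount(j & x)."""
    n = len(values)
    return [sum(v * (-1) ** popcount(j & x) for x, v in enumerate(values)) / n
            for j in range(n)]

def pack(x, positions):
    """Hypothetical helper: gather the bits of x at `positions` into a small index."""
    out = 0
    for i, p in enumerate(positions):
        out |= ((x >> p) & 1) << i
    return out

def unpack(local, positions):
    """Hypothetical helper: spread the bits of `local` onto `positions`."""
    out = 0
    for i, p in enumerate(positions):
        out |= ((local >> i) & 1) << p
    return out

def landscape(x, subfunctions):
    """Embedded landscape: sum of subfunctions, each reading a subset of bits."""
    return sum(table[pack(x, positions)] for positions, table in subfunctions)

def sparse_walsh(subfunctions):
    """Walsh coefficients of the landscape, computed one subfunction at a time.
    Only indices whose bits lie inside some subfunction's positions can be nonzero."""
    coeffs = {}
    for positions, table in subfunctions:
        for local_j, w in enumerate(walsh_coefficients(table)):
            if w != 0:
                j = unpack(local_j, positions)
                coeffs[j] = coeffs.get(j, 0.0) + w
    return coeffs

# Illustrative landscape on L = 4 bits with three 2-bit subfunctions.
L = 4
AND, XOR, OR = [0, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1]
SUBFUNCTIONS = [((0, 1), AND), ((2, 3), XOR), ((1, 2), OR)]

sparse = sparse_walsh(SUBFUNCTIONS)
full = walsh_coefficients([landscape(x, SUBFUNCTIONS) for x in range(1 << L)])
highest_order = max(popcount(j) for j, w in sparse.items() if w != 0)
```

In this toy case the sparse result agrees with the brute-force transform over all 2^L strings, and no nonzero coefficient has order above 2, the size of the largest subfunction domain. The brute-force call is there only as a check; the per-subfunction loop costs on the order of the number of subfunctions times 4^k with this naive inner transform.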