The Genetic Programming Search Space

We shall investigate how the space of all possible programs, i.e. the space which genetic programming searches, scales, and in particular how it changes with respect to the size of programs. We will show that, in general, above some problem-dependent threshold, the fitness of programs, considered over all programs, shows little variation with their size. The distribution of fitness levels, and particularly the distribution of solutions, directly gives us the performance of random search. We can use this as a benchmark against which to compare GP and other techniques. These results are demonstrated in this chapter using a combination of enumeration and Monte Carlo sampling. For the interested reader, formal proofs are given in Chapter 8. Informal arguments are presented in Section 7.5 to extend our results to modular GP, memory and Turing-complete GP. Section 7.6 considers the relationship between tree size and depth for various types of program. We finish this chapter with a discussion of these results and their implications.
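To make the link between the distribution of solutions and random search concrete, the following is a minimal sketch, not the method used in this chapter. It assumes that a fraction p of the programs being sampled are solutions and that random search draws n programs independently and uniformly, so that the chance of finding at least one solution is 1 - (1 - p)^n. The names success_probability, estimate_solution_fraction, toy_sample and toy_is_solution are hypothetical, and the toy "programs" are simply integers standing in for real program trees.

```python
import random


def success_probability(p, n):
    """Chance that n independent uniform samples contain at least one solution,
    assuming a fraction p of all programs are solutions."""
    return 1.0 - (1.0 - p) ** n


def estimate_solution_fraction(sample_program, is_solution, trials, rng):
    """Monte Carlo estimate of p: draw `trials` random programs and
    return the fraction that are solutions."""
    hits = sum(1 for _ in range(trials) if is_solution(sample_program(rng)))
    return hits / trials


# Toy stand-in for a program space (an assumption for illustration only):
# a "program" is an integer in 0..99 and it "solves" the problem if it is
# divisible by 10, so the true fraction of solutions is 0.1.
def toy_sample(rng):
    return rng.randrange(100)


def toy_is_solution(program):
    return program % 10 == 0


if __name__ == "__main__":
    rng = random.Random(0)
    p_hat = estimate_solution_fraction(toy_sample, toy_is_solution, 20000, rng)
    print("estimated fraction of solutions:", p_hat)
    # With p = 0.1, ten random samples succeed with probability about 0.651.
    print("success probability for n = 10:", success_probability(0.1, 10))
```

An estimate of p obtained by enumeration or sampling therefore fixes the random search benchmark directly: any other search technique can be compared against success_probability(p, n) for the same number of program evaluations n.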