Crossover Bias in Genetic Programming

Path length, or search complexity, is an understudied property of trees in genetic programming. Unlike size and depth measures, path length directly measures the balancedness or skewedness of a tree. Here a close relative to path length, called visitation length, is studied. It is shown that a population undergoing standard crossover will introduce a crossover bias in the visitation length. This bias is due to inserting variable length subtrees at various levels of the tree. The crossover bias takes the form of a covariance between the sizes and levels in the trees that form a population. It is conjectured that the crossover bias directly determines the size distribution of trees in genetic programming. Theorems are presented for the one-generation evolution of visitation length both with and without selection. The connection between path length and visitation length is made explicit.