Introduction to genetic programming

Genetic programming (GP) is a collection of evolutionary computation techniques that allow computers to solve problems automatically. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. Since its inception twenty years ago, GP has been used to solve a wide range of practical problems, producing a number of human-competitive results and even patentable new inventions. Like many other areas of computer science, GP is evolving rapidly, with new ideas, techniques and applications being constantly proposed, making it is difficult for people to identify the key ideas in the field and keep up with the pace of new developments. The aim of this tutorial is to provide a roadmap to GP for both newcomers and old-timers. The tutorial starts with a gentle introduction which describes how a population of programs is stored in the computer so that they can evolve with time. We explain how programs are represented, how random programs are initially created, and how GP creates a new generation by mutating the better existing programs or combining pairs of good parent programs to produce offspring programs. Then, we describe a variety of alternative representations for programs and some advanced GP techniques. These include: the evolution of machine-code and parallel programs, the use of grammars and probability distributions for the generation of programs, variants of GP which allow the solution of problems with multiple objectives, many speed-up techniques and some useful theoretical tools. To illustrate genetic programming's scope, we then review some real-world applications of GP. The tutorial is concluded by a series of recommendations and suggestions to obtain the most from a GP system and open questions.