Efficient Evolution of Machine Code for CISC Architectures using Blocks and Homologous Crossover

This chapter describes recent advances in genetic programming of machine code. Evolutionary program induction of binary machine code is one of the fastest GP methods and the best-studied linear approach. The technique was previously known as the Compiling Genetic Programming System (CGPS); to avoid confusion with methods that use an actual compiler, and to separate the system from the method, it has been renamed Automatic Induction of Machine Code with Genetic Programming (AIM-GP). AIM-GP stores each individual as a linear string of native binary machine code, which the processor executes directly. The absence of an interpreter and of complex memory handling increases speed by several orders of magnitude. AIM-GP has so far been applied to processors with a fixed instruction length (RISC) using integer arithmetic. This chapter describes several new advances to the AIM-GP method that are important for the applicability of the technique. These include enabling the induction of code for CISC processors, such as the most widespread computer architecture, Intel x86, as well as for Java and many embedded processors. The new technique also makes AIM-GP more portable in general and simplifies its adaptation to any processor architecture. Other additions include the use of floating-point instructions, control flow instructions, automatically defined functions (ADFs), and new genetic operators, e.g. aligned homologous crossover. We also discuss the benefits and drawbacks of register machine GP versus tree-based GP. This chapter is directed towards the practitioner who wants to extend AIM-GP to new architectures and application domains.
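
As a rough illustration of the idea behind blocks and aligned homologous crossover, the following Python sketch treats an individual as a list of fixed-size byte blocks and exchanges the same block positions between two parents. It is a minimal sketch under stated assumptions, not the AIM-GP implementation: the block width `BLOCK_SIZE`, the helper names `to_blocks` and `homologous_crossover`, and the two-point scheme are all hypothetical choices made for illustration, and the byte strings stand in for machine code rather than encoding real instructions.

```python
import random

# Assumed block width in bytes (hypothetical value, not taken from the chapter).
BLOCK_SIZE = 4


def to_blocks(code: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a linear code string into fixed-size blocks."""
    if len(code) % block_size != 0:
        raise ValueError("code length must be a multiple of the block size")
    return [code[i:i + block_size] for i in range(0, len(code), block_size)]


def homologous_crossover(
    parent_a: list[bytes],
    parent_b: list[bytes],
    rng: random.Random,
) -> tuple[list[bytes], list[bytes]]:
    """Swap the same range of block positions between two parents.

    Both crossover points fall on block boundaries and are identical in
    both parents, so every block keeps its position and each child keeps
    the length of the parent it was derived from.
    """
    limit = min(len(parent_a), len(parent_b))
    if limit == 0:
        return list(parent_a), list(parent_b)
    start = rng.randrange(limit)               # first block to exchange
    end = rng.randrange(start + 1, limit + 1)  # one past the last block
    child_a = parent_a[:start] + parent_b[start:end] + parent_a[end:]
    child_b = parent_b[:start] + parent_a[start:end] + parent_b[end:]
    return child_a, child_b


if __name__ == "__main__":
    rng = random.Random(0)
    a = to_blocks(bytes(range(16)))        # 4 placeholder blocks
    b = to_blocks(bytes(range(100, 124)))  # 6 placeholder blocks
    child_a, child_b = homologous_crossover(a, b, rng)
    print(len(child_a), len(child_b))
```

Because whole blocks are exchanged at matching positions, no instruction is ever cut in the middle, which is the property that matters when instructions have variable length.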