An Artificial Economy of Post Production Systems

We study how a computer program can learn, by interacting with an environment, to return an algorithm for solving a class of problems. The two example domains studied in this paper are Blocks World stacking problems and Rubik's Cube. Our approach is to simulate the evolution of an artificial economy of computer programs called "agents". Simple rules imposed on the economy result in credit assignment, which factors the problem of evolving an overall program for the class of problems into simpler problems of evolving agents that specialize in aspects of the problem and collaborate to solve the overall class. In this paper our agents are Post Production Systems. Our system, called Hayek4, has learned from random examples a program that solves arbitrary block stacking problems. The program consists essentially of about 5 learned rules and some learned control information. Solving an instance with n blocks in its goal stack requires automatically chaining the rules in the correct sequence, about 2n deep. Hayek4 has also learned to solve Rubik's Cubes scrambled with up to about 7 random rotations. These results can also be viewed, in the context of automatic theorem proving, as a way to learn domain knowledge that allows compact proofs to be generated automatically.
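
To make the notion of rule chaining concrete, the following is a minimal sketch of a Post production system in its normal form, where a rule (g, h) rewrites a string g + P into P + h. It is an illustration only and is not Hayek4's representation: the names `step`, `chain`, and `TOY_RULES` are hypothetical, the two toy rules are hand-written rather than learned, and the deterministic first-match control is an assumption. The toy rules are chosen so that an input of n symbols halts after 2n rule applications, loosely mirroring the chaining depth described above.

```python
from typing import List, Optional, Tuple

# A rule (g, h) in Post normal form rewrites the string g + P into P + h.
Rule = Tuple[str, str]


def step(word: str, rules: List[Rule]) -> Optional[str]:
    """Apply the first rule whose left side g is a prefix of word.

    Returns the rewritten string, or None if no rule applies.
    First-match ordering is an assumption made for this sketch.
    """
    for g, h in rules:
        if word.startswith(g):
            return word[len(g):] + h
    return None


def chain(word: str, rules: List[Rule], max_steps: int = 1000) -> Tuple[str, int]:
    """Chain rule applications until no rule applies or max_steps is reached.

    Returns the final string and the number of rules applied.
    """
    steps = 0
    while steps < max_steps:
        nxt = step(word, rules)
        if nxt is None:
            break
        word = nxt
        steps += 1
    return word, steps


# Hypothetical toy rules: each 'a' becomes 'b', then each 'b' becomes 'c'.
TOY_RULES: List[Rule] = [("a", "b"), ("b", "c")]

if __name__ == "__main__":
    final, depth = chain("aaa", TOY_RULES)
    print(final, depth)
```

In this sketch the order of rule application is fixed by hand. In the setting described above, both the rules and the control information that sequences them are learned.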