Challenges in coevolutionary learning: arms-race dynamics, open-endedness, and mediocre stable states

Coevolution has been proposed as a way to evolve a learner and a learning environment simultaneously such that open-ended progress arises naturally, via a competitive arms race, with minimal inductive bias. Nevertheless, the conditions necessary to initiate and sustain arms-race dynamics are not well understood; mediocre stable states frequently result from learning through self-play (Angeline & Pollack 1994), while analysis usually requires closed domains with known optima, like sorting networks (Hillis 1991). While intuitions regarding what enables successful coevolution abound, none have been methodically tested. We present a game that affords such methodical investigation. A population of deterministic string generators is coevolved with two populations of string predictors, one "friendly" and one "hostile"; generators are rewarded for behaving in a manner that is simultaneously predictable to the friendly predictors and unpredictable to the hostile predictors. This game design allows us to employ information theory to provide rigorous characterizations of agent behavior and coevolutionary progress. Further, we can craft agents of known ability and environments of known difficulty, and thus precisely frame questions regarding learnability. Our results show that subtle changes to the game determine whether it is open-ended, and profoundly affect the existence and nature of an arms race.
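
The reward structure of the game can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes binary strings, next-bit predictors, and a simple score equal to the friendly predictor's accuracy minus the hostile predictor's accuracy. The names `prediction_accuracy`, `generator_reward`, `repeat_last`, and `alternate` are hypothetical, and raw accuracy stands in for the information-theoretic measures mentioned above.

```python
from typing import Callable, Sequence

# Hypothetical type: a predictor maps the bits seen so far to a guess for the next bit.
Predictor = Callable[[Sequence[int]], int]


def prediction_accuracy(bits: Sequence[int], predictor: Predictor) -> float:
    """Fraction of bits the predictor guesses correctly from the preceding prefix."""
    if not bits:
        raise ValueError("bits must be non-empty")
    hits = sum(predictor(bits[:i]) == bits[i] for i in range(len(bits)))
    return hits / len(bits)


def generator_reward(bits: Sequence[int], friendly: Predictor, hostile: Predictor) -> float:
    # Assumed scoring rule (not taken from the source): reward predictability to the
    # friendly predictor and penalize predictability to the hostile predictor.
    return prediction_accuracy(bits, friendly) - prediction_accuracy(bits, hostile)


def repeat_last(prefix: Sequence[int]) -> int:
    """Toy predictor: guess that the next bit equals the previous one (0 at the start)."""
    return prefix[-1] if prefix else 0


def alternate(prefix: Sequence[int]) -> int:
    """Toy predictor: guess that the next bit flips the previous one (0 at the start)."""
    return 1 - prefix[-1] if prefix else 0


# A deterministic generator's output: a strictly alternating string.
bits = [0, 1, 0, 1, 0, 1, 0, 1]
print(generator_reward(bits, friendly=alternate, hostile=repeat_last))
```

Under these assumptions, the alternating string scores highly when the friendly predictor expects alternation and the hostile predictor expects repetition; swapping the two predictors reverses the sign of the reward.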