A Population-Differential Method of Monitoring Success and Failure in Coevolution

Coevolutionary algorithms require no domain-specific measure of objective fitness, enabling these algorithms to be applied to domains for which no objective metric is known or for which known metrics are too expensive. But this flexibility comes at the expense of accountability. Past work on monitoring has focused on measuring success, but has ignored failure. This limitation is due to a common reliance on "best-of-generation" (BOG) based analysis (1), and we propose a population-differential analysis based on an alternate "all-of-generation" (AOG) framework that is not similarly limited.

Coevolutionary analysis based on generation tables was introduced by Cliff and Miller as CIAO data (2). In dual-population coevolution, the table's rows are assigned to the first population's generations, and its columns to the second population's generations. Each internal entry contains a best-vs-best evaluation of the two intersecting generations. This BOG approach appears particularly problematic for two reasons. First, the analysis varies depending on the definition of "best" (within a population), but this definition has become arbitrarily fixed on the Last Elite Opponent (LEO) criterion (3), while alternate definitions are equally viable. The coevolutionary algorithm under examination may itself define "best" differently (e.g., Pareto coevolution defines it as "on the Pareto front"), in which case LEO is inappropriate. Second, while BOG-based analysis may give useful insight into the algorithmic dynamics of successful individuals (i.e., the "best"), it reveals little about the population as a whole (i.e., the "rest"), and is therefore blind to many failures.

As an "all-of-generation" alternative, rather than identifying the "best" member of each population and recording the outcome of their interaction, AOG records the outcomes of all interactions between every pair of individuals drawn from the two populations, respectively.
In the data provided below, we implement this population-grained evaluation PEval as an averaging of all individual evaluations (each of which is either win, tie, or lose, denoted numerically as 1, 0, and -1, respectively). Next we construct the population-differential analysis measure, based on the insight that, if coevolution is successful, the progression of candidate generations ought to perform better over time with respect to a fixed test generation (and vice versa). First we define a single distinction with the population comparators (between current generation i and oldest generation in memory, j). We then collect all available such comparisons at each generation (where o is the oldest known generation) with the candidate and test performance metrics.
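
The population-grained evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `p_eval`, `aog_table`, and `compare_numbers` are hypothetical, and the toy domain (integers, where the larger number wins) is assumed only so the example runs. The sketch averages the win/tie/lose outcomes (1, 0, -1) over every candidate-test pair, then fills a generation table with those averages in place of best-vs-best results.

```python
from itertools import product
from typing import Callable, List, Sequence, TypeVar

C = TypeVar("C")  # candidate individual type
T = TypeVar("T")  # test individual type


def p_eval(candidates: Sequence[C], tests: Sequence[T],
           play: Callable[[C, T], int]) -> float:
    """Average outcome over all candidate-test pairs.

    `play` is assumed to return 1 (win), 0 (tie), or -1 (lose)
    from the candidate's point of view.
    """
    outcomes = [play(c, t) for c, t in product(candidates, tests)]
    if not outcomes:
        raise ValueError("both populations must be non-empty")
    return sum(outcomes) / len(outcomes)


def aog_table(candidate_gens: Sequence[Sequence[C]],
              test_gens: Sequence[Sequence[T]],
              play: Callable[[C, T], int]) -> List[List[float]]:
    """All-of-generation table: rows are candidate generations,
    columns are test generations, entries are p_eval averages."""
    return [[p_eval(cg, tg, play) for tg in test_gens]
            for cg in candidate_gens]


def compare_numbers(candidate: int, test: int) -> int:
    """Toy game (an assumption for illustration): larger number wins."""
    return (candidate > test) - (candidate < test)


if __name__ == "__main__":
    candidate_gens = [[1, 2], [3, 4]]  # two generations of candidates
    test_gens = [[2, 2], [3, 5]]       # two generations of tests
    for row in aog_table(candidate_gens, test_gens, compare_numbers):
        print(row)
```

Each entry lies in the interval [-1, 1], so a generation whose elite wins but whose remaining members lose still produces a low score. A best-vs-best entry would not record this.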