Efficient Statistical Methods for Evaluating Trading Agent Performance

Market simulations, like their real-world counterparts, are typically domains of high complexity, high variability, and incomplete information. The performance of autonomous agents in these markets depends both upon the strategies of their opponents and on various market conditions, such as supply and demand. Because the space for possible strategies and market conditions is very large, empirical analysis in these domains becomes exceedingly difficult. Researchers who wish to evaluate their agents must run many test games across multiple opponent sets and market conditions to verify that agent performance has actually improved. Our approach is to improve the statistical power of market simulation experiments by controlling their complexity, thereby creating an environment more conducive to structured agent testing and analysis. We develop a tool that controls variability across games in one such market environment, the Trading Agent Competition for Supply Chain Management (TAC SCM), and demonstrate how it provides an efficient, systematic method for TAC SCM researchers to analyze agent performance.