Leveling the Playing Field - Fairness in AI Versus Human Game Benchmarks

Since the earliest days of AI, games have attracted interest as a research platform. As the field matured, human-level competence in complex games became a target that researchers worked toward. Only relatively recently has this target been met for traditional tabletop games such as Backgammon, Chess, and Go, prompting a shift in research focus toward electronic games, which present new and distinct challenges. As is often the case in AI research, these results are liable to be exaggerated or misrepresented, whether by the authors themselves or by third parties. The extent to which such game benchmarks constitute "fair" competition between humans and AI is also a matter of debate. In this paper, we review statements made by researchers and by third parties, in both the general media and academic publications, about these game benchmark results. We analyze what a fair competition would look like and propose a taxonomy of dimensions to frame the debate over fairness in game contests between humans and machines. Ultimately, we argue that there is no completely fair way to compare human and AI performance on a game.
