Evaluating a Reinforcement Learning Algorithm with a General Intelligence Test

In this paper we apply the recent notion of anytime universal intelligence tests to the evaluation of a popular reinforcement learning algorithm, Q-learning. We show that a general approach to the intelligence evaluation of AI algorithms is feasible. This top-down (theory-derived) approach is based on generating environments under a Solomonoff universal distribution, rather than using a pre-defined set of specific tasks such as mazes or problem repositories. This first application of a general intelligence test to a reinforcement learning algorithm raises the issue of task-specific vs. general AI agents. This, in turn, suggests new avenues for AI agent evaluation and AI competitions, and also yields further insights into the performance of specific algorithms.
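
To make the setup concrete, the following is a minimal, hypothetical sketch (not the paper's actual test or environment class) of the two ingredients the abstract names: tabular Q-learning, and an evaluation over environments sampled with a simplicity-weighted distribution. Here "description length" is crudely stood in for by the size of a random transition table, and the 2^-length weighting only mimics the spirit of a Solomonoff-style universal distribution; all class and function names are illustrative.

```python
import random
from collections import defaultdict


class ToyEnvironment:
    """A small random MDP; `size` acts as a crude proxy for description length."""

    def __init__(self, size, rng):
        self.size = size
        self.n_actions = 2
        # Random deterministic transitions and rewards over `size` states.
        self.transitions = {(s, a): rng.randrange(size)
                            for s in range(size) for a in range(self.n_actions)}
        self.rewards = {(s, a): rng.choice([-1.0, 0.0, 1.0])
                        for s in range(size) for a in range(self.n_actions)}
        self.state = 0

    def step(self, action):
        reward = self.rewards[(self.state, action)]
        self.state = self.transitions[(self.state, action)]
        return self.state, reward


def sample_environment(rng, max_size=8):
    """Sample an environment with probability roughly proportional to 2^-size,
    a computable stand-in for weighting simpler environments more heavily."""
    sizes = list(range(2, max_size + 1))
    weights = [2.0 ** -s for s in sizes]
    size = rng.choices(sizes, weights=weights, k=1)[0]
    return ToyEnvironment(size, rng)


def q_learning_return(env, steps=200, alpha=0.2, gamma=0.95, eps=0.1, rng=None):
    """Run epsilon-greedy tabular Q-learning on one environment; return mean per-step reward."""
    rng = rng or random.Random(0)
    q = defaultdict(float)
    total = 0.0
    state = env.state
    for _ in range(steps):
        if rng.random() < eps:
            action = rng.randrange(env.n_actions)
        else:
            action = max(range(env.n_actions), key=lambda a: q[(state, a)])
        next_state, reward = env.step(action)
        best_next = max(q[(next_state, a)] for a in range(env.n_actions))
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        total += reward
        state = next_state
    return total / steps


if __name__ == "__main__":
    rng = random.Random(42)
    scores = [q_learning_return(sample_environment(rng), rng=rng) for _ in range(50)]
    print("mean per-step reward over sampled environments:", sum(scores) / len(scores))
```

The aggregate score over many sampled environments plays the role of the general, task-independent measurement the abstract describes, as opposed to scoring the agent on a single hand-picked benchmark such as a maze.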
