The Effects of Bounding Rationality on the Performance and Learning of CHREST Agents in Tileworld

Learning in complex domains is fundamental to performing suitable and timely actions within them. The ability of chess masters to learn and recall huge numbers of board configurations to produce near-optimal actions provides evidence that chunking mechanisms are likely to underpin human learning. Cognitive theories based on chunking argue in favour of the notion of bounded rationality, since the chunks of information learnt are relatively small in comparison to the total information present in the environment. CHREST, a computational architecture that implements chunking theory, has previously been used to investigate learning in deterministic environments such as chess, where future states depend solely upon the actions of agents. In this paper, the CHREST architecture is implemented in agents situated in “Tileworld”, a stochastic environment whose future state depends both on the actions of agents and on factors intrinsic to the environment over which agents have no control. The effects of bounding agents’ visual input on learning and performance are analysed using computer simulations across various scenarios in which the complexity of Tileworld is altered. Our results show that interactions between independent variables are complex and have important implications for agents situated in stochastic environments, where a balance must be struck between learning and performance.
