Production-based Cognitive Models as a Test Suite for Reinforcement Learning Algorithms

We introduce a framework in which production-rule-based computational cognitive modeling and reinforcement learning (RL) can systematically interact and inform each other. We focus on linguistic applications because the sophisticated rule-based cognitive models needed to capture linguistic behavioral data promise to provide a stringent test suite for RL algorithms, connecting them to both accuracy and reaction-time experimental data. This opens a path toward assembling an experimentally rigorous and cognitively realistic benchmark for RL algorithms. We extend our previous work on lexical decision tasks and tabular RL algorithms (Brasoveanu and Dotlacil, 2020b) with a discussion of neural-network-based approaches and of how parsing can be formalized as an RL problem.
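
To make the last point concrete, the sketch below illustrates one way parsing might be cast as an RL problem: parser configurations serve as tabular states, production-rule firings as actions, and completing a parse yields a reward. This is a minimal illustration under invented assumptions, not the model from the paper; all state names, rules, and reward values are hypothetical, and the update rule is standard tabular Q-learning [8].

```python
# A minimal, purely illustrative sketch: tabular Q-learning over a toy
# "parsing" MDP in which actions are production-rule firings. All states,
# rules, and rewards below are hypothetical, not taken from the paper.
import random
from collections import defaultdict

# Toy transition table: (state, production rule) -> next state.
# "complete" is the terminal state (a finished parse).
TRANSITIONS = {
    ("start", "shift"): "np-open",
    ("np-open", "project-np"): "np-closed",
    ("np-closed", "shift"): "vp-open",
    ("vp-open", "project-vp"): "complete",
}
ACTIONS = ["shift", "project-np", "project-vp"]

def step(state, action):
    """Fire a production rule; inapplicable rules leave the state unchanged."""
    next_state = TRANSITIONS.get((state, action), state)
    # Small per-rule cost (loosely mirroring time cost), reward on completion.
    reward = 1.0 if next_state == "complete" else -0.01
    return next_state, reward, next_state == "complete"

def q_learning(episodes=2000, alpha=0.1, gamma=0.99, epsilon=0.1):
    q = defaultdict(float)  # Q(s, a) table, implicitly initialized to 0
    for _ in range(episodes):
        state = "start"
        for _ in range(100):  # cap episode length
            # Epsilon-greedy choice among production rules.
            if random.random() < epsilon:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Standard tabular Q-learning update (Watkins and Dayan 1992).
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q

if __name__ == "__main__":
    q = q_learning()
    # The greedy policy should fire rules in the order that completes the parse.
    state = "start"
    for _ in range(10):
        if state == "complete":
            break
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        print(state, "->", action)
        state, _, _ = step(state, action)
```

A neural-network-based variant of the kind the abstract mentions would replace the Q table with a function approximator; the environment interface (states, rule-actions, rewards) would stay the same.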

[1] J. L. Elman. Learning and development in neural networks: The importance of starting small. Cognition, 1993.

[2] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.

[3] A. Brasoveanu and J. Dotlacil. Reinforcement Learning for Production-Based Cognitive Models. Topics in Cognitive Science, 2021.

[4] D. P. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. ICLR, 2015.

[5] V. Mnih et al. Human-level control through deep reinforcement learning. Nature, 2015.

[6] A. A. Rusu et al. Progressive Neural Networks. arXiv preprint, 2016.

[7] H. van Seijen et al. A theoretical and empirical analysis of Expected Sarsa. IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2009.

[8] C. J. C. H. Watkins and P. Dayan. Q-learning. Machine Learning, 1992.

[9] A. Brasoveanu. An extensible framework for mechanistic processing models: From representational linguistic theories to quantitative model comparison. 2018.

[10] R. L. Lewis and S. Vasishth. An Activation-Based Model of Sentence Processing as Skilled Memory Retrieval. Cognitive Science, 2005.

[11] J. T. Hale. What a Rational Parser Would Do. Cognitive Science, 2011.

[12] W.-T. Fu and J. R. Anderson. From recurrent choice to skill learning: A reinforcement-learning model. Journal of Experimental Psychology: General, 2006.

[13] J. R. Anderson. How Can the Human Mind Occur in the Physical Universe? Oxford University Press, 2007.

[14] N. A. Taatgen and J. R. Anderson. Why do children learn to say "Broke"? A model of learning the past tense without feedback. Cognition, 2002.

[15] A. Brasoveanu and J. Dotlacil. Computational Cognitive Modeling and Linguistic Theory. Springer, 2020.

[16] C. J. C. H. Watkins. Learning from Delayed Rewards. PhD thesis, University of Cambridge, 1989.

[17] P. Resnik. Left-Corner Parsing and Psychological Plausibility. COLING, 1992.

[18] J. R. Anderson and C. Lebiere. The Atomic Components of Thought. Lawrence Erlbaum Associates, 1998.