Solving the credit assignment problem: explicit and implicit learning of action sequences with probabilistic outcomes

In most problem-solving activities, feedback is received only at the end of an action sequence. This creates a credit-assignment problem: the learner must associate the feedback with earlier actions, and because the actions are interdependent, the learner must also remember past choices. In two studies, we investigated the nature of explicit and implicit learning processes in the credit-assignment problem using a probabilistic sequential choice task performed with and without a secondary memory task. When explicit learning was dominant, participants learned to select the better option faster in their first choices than in their last choices. When implicit reinforcement learning was dominant, participants learned to select the better option faster in their last choices than in their first choices. Consistent with the probability-learning and sequence-learning literature, the results show that credit assignment involves two processes: an explicit memory-encoding process that requires memory rehearsal and an implicit reinforcement-learning process that propagates credit backward to previous choices.
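The backward propagation of credit described above can be illustrated with a minimal temporal-difference sketch. It is not the model or the task used in the studies: the two-step task, the reward probabilities, the parameters `alpha` and `epsilon`, and the names `td_update` and `simulate` are all assumptions introduced here for illustration.

```python
import random


def td_update(value, target, alpha):
    """Move `value` a fraction `alpha` of the way toward `target`."""
    return value + alpha * (target - value)


def simulate(n_episodes=5000, alpha=0.05, epsilon=0.05, seed=0):
    """Two-choice sequences with probabilistic feedback after the last choice."""
    rng = random.Random(seed)
    # Hypothetical reward probabilities for each (first, last) choice pair.
    p_reward = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 0.9}
    q_first = [0.0, 0.0]
    q_last = [[0.0, 0.0], [0.0, 0.0]]  # indexed by first choice, then last choice

    def choose(values):
        # Epsilon-greedy selection with random tie-breaking.
        if rng.random() < epsilon or values[0] == values[1]:
            return rng.randrange(2)
        return 0 if values[0] > values[1] else 1

    for _ in range(n_episodes):
        a1 = choose(q_first)
        a2 = choose(q_last[a1])
        reward = 1.0 if rng.random() < p_reward[(a1, a2)] else 0.0
        # The last choice is credited directly from the end-of-sequence feedback.
        q_last[a1][a2] = td_update(q_last[a1][a2], reward, alpha)
        # The first choice never sees the reward itself; it is credited with the
        # learned value of the best follow-up choice, so credit flows backward.
        q_first[a1] = td_update(q_first[a1], max(q_last[a1]), alpha)
    return q_first, q_last


q_first, q_last = simulate()
print("first-choice values:", q_first)
print("last-choice values:", q_last)
```

In this sketch the first-choice values can rise only after the last-choice values have been learned, which is one simple way a reinforcement-learning process could produce faster learning for last choices than for first choices.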
