Dopamine and self-directed learning

Humans are intrinsically motivated to learn. Such motivation is necessary for any human-like learner, and helpful for any learning system designed to achieve general intelligence. We discuss the limited existing computational work in this area and link it to known and theorized properties of the dopamine system. The relatively well-understood mechanisms by which dopamine release signals unpredicted reward can also serve to signal new learning. Dopamine release leads to maintenance of current representations, which serves to “lock” attention onto topics or tasks in which useful learning is occurring. We thus propose a novel but natural extension of known aspects of dopamine function to support self-directed learning of arbitrary self-defined tasks. If this hypothesis is correct, detailed experimental evidence on dopamine function can help guide computational research into human-like learning systems.
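The proposed mechanism can be illustrated with a toy simulation. The sketch below is not the model described in this work; it is a minimal illustration under stated assumptions. It assumes that "new learning" is measured as the drop in error on a task, that the dopamine-like signal is the part of that drop that was not predicted (a simple delta-rule prediction error), and that a positive signal keeps the learner on its current task. All names (`run`, `choose_task`, `expected_progress`) and all parameter values are hypothetical.

```python
def choose_task(current_task, signal, n_tasks, threshold=0.0):
    """Stay 'locked' on the current task while the signal exceeds the
    threshold; otherwise move on to the next task (assumed switching rule)."""
    if signal > threshold:
        return current_task
    return (current_task + 1) % n_tasks


def run(steps=6):
    """Simulate a learner choosing between a learnable and an unlearnable task."""
    errors = [1.0, 1.0]             # current error on each task
    learn_rates = [0.5, 0.0]        # task 0 is learnable, task 1 is not
    expected_progress = [0.0, 0.0]  # learned prediction of progress per task
    alpha = 0.3                     # learning rate for the progress prediction
    task = 1                        # start on the unlearnable task
    history = []
    for _ in range(steps):
        prev = errors[task]
        errors[task] = prev * (1.0 - learn_rates[task])
        progress = prev - errors[task]                # new learning on this step
        signal = progress - expected_progress[task]   # unpredicted part only
        expected_progress[task] += alpha * signal     # delta-rule update
        history.append((task, signal))
        task = choose_task(task, signal, n_tasks=2)
    return history


if __name__ == "__main__":
    for step, (task, signal) in enumerate(run()):
        print(f"step {step}: task {task}, signal {signal:+.3f}")
```

In this toy setting the learner leaves the unlearnable task at once, stays on the learnable task while progress exceeds what it has come to expect, and disengages once progress becomes predictable. The choice of a delta rule and a fixed threshold is for brevity only.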
