Functions and Mechanisms of Intrinsic Motivations: The Knowledge Versus Competence Distinction

Mammals, and humans in particular, are endowed with an exceptional capacity for cumulative learning. This capacity crucially depends on the presence of intrinsic motivations, that is, motivations that are directly related not to an organism’s survival and reproduction but rather to its ability to learn. Recently, there have been a number of attempts to model and reproduce intrinsic motivations in artificial systems. Different kinds of intrinsic motivations have been proposed both in psychology and in machine learning and robotics: some are based on the knowledge of the learning system, while others are based on its competence. In this contribution, we discuss the distinction between knowledge-based and competence-based intrinsic motivations with respect to both the functional roles that motivations play in learning and the mechanisms by which those functions are implemented. In particular, after arguing that the principal function of intrinsic motivations consists in allowing the development of a repertoire of skills (rather than of knowledge), we suggest that at least two different sub-functions can be identified: (a) discovering which skills might be acquired and (b) deciding which skill to train when. We propose that in biological organisms, knowledge-based intrinsic motivation mechanisms might implement the former function, whereas competence-based mechanisms might underlie the latter one.

M. Mirolli, G. Baldassarre
Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche, Rome, Italy
e-mail: marco.mirolli@istc.cnr.it; gianluca.baldassarre@istc.cnr.it

G. Baldassarre and M. Mirolli (eds.), Intrinsically Motivated Learning in Natural and Artificial Systems, DOI 10.1007/978-3-642-32375-1_3, © Springer-Verlag Berlin Heidelberg 2013
