Action Discovery and Intrinsic Motivation: A Biologically Constrained Formalisation

We introduce a biologically motivated formal framework, or “ontology”, for dealing with many aspects of action discovery, which we argue is an example of intrinsically motivated behaviour (as such, this chapter is a companion to that by Redgrave et al. in this volume). We argue that action discovery requires an interplay between separate internal models: forward models for prediction and inverse models that map outcomes to actions. The learning of actions is driven by transient changes in the animal’s policy (repetition bias), which are, in turn, a result of unpredicted, phasic sensory information (“surprise”). The notion of salience as value is introduced and broken down into contributions from novelty (or surprise), immediate reward acquisition, and general task/goal attainment. Many other aspects of biological action discovery emerge naturally in our framework, which aims to guide future modelling efforts in this domain.
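
As a purely illustrative sketch, and not the formalisation developed in the chapter, the link between surprise, salience and repetition bias might be caricatured as follows. Every function name, the additive form of salience, the negative log probability measure of surprise, the softmax policy and the decay constant are assumptions introduced here for illustration only.

```python
import math

def surprise(predicted_prob: float) -> float:
    """Surprise of an observed outcome, taken here (an assumption) as
    the negative log of the probability the forward model gave it."""
    return -math.log(max(predicted_prob, 1e-12))

def salience(novelty: float, reward: float, goal: float,
             weights=(1.0, 1.0, 1.0)) -> float:
    """Hypothetical additive combination of the three contributions
    to salience named in the text; the weights are arbitrary."""
    w_n, w_r, w_g = weights
    return w_n * novelty + w_r * reward + w_g * goal

def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def apply_repetition_bias(bias, last_action, value, decay=0.5):
    """Decay all transient biases, then boost the action that
    immediately preceded the salient event."""
    new_bias = [b * decay for b in bias]
    new_bias[last_action] += value
    return new_bias

# Three equally preferred actions; action 1 is followed by an outcome
# to which the forward model assigned probability 0.1.
base_prefs = [0.0, 0.0, 0.0]
bias = [0.0, 0.0, 0.0]
s = salience(novelty=surprise(0.1), reward=0.0, goal=0.0)
bias = apply_repetition_bias(bias, last_action=1, value=s)
policy = softmax([p + b for p, b in zip(base_prefs, bias)])
```

In this toy setting the policy is transiently skewed towards repeating action 1. If the outcome later becomes predictable, the surprise term shrinks and the decay returns the policy towards its baseline.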
