Action Selection and Operant Conditioning: A Neurorobotic Implementation

Action selection (AS) is thought to be the mechanism natural agents use to decide their next move or action. Is there an elementary functional core sustaining this cognitive process? Could we reproduce the mechanism in an artificial agent, and more specifically in a neurorobotic paradigm? Unsupervised autonomous robots may require a decision-making skill to operate in the real world, and this paper explores the bioinspired approach to providing it. We propose simulating an AS process with a small spiking neural network (SNN), comparable to those of lower neural organisms, in order to control virtual and physical robots. We base our AS process on a simple central pattern generator (CPG), decision neurons, sensory neurons, and motor neurons as the main circuit components. The novelty of this study is that it targets a specific operant conditioning (OC) context that is relevant to an AS process: choices influence future sensory feedback. Using a simple adaptive scenario, we show the complementary interaction of both phenomena. We also suggest that this AS kernel could serve as a fast-track model for efficiently designing complex SNNs that include a growing number of input stimuli and motor outputs. Our results demonstrate that merging AS and OC brings flexibility to behavior in generic dynamical situations.
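To make the circuit described above more concrete, the following is a minimal sketch, not the authors' model. It assumes two leaky integrate-and-fire decision neurons that receive a constant sensory input plus a rhythmic pulse standing in for the CPG, a winner-take-all readout in place of explicit lateral inhibition and motor neurons, and a simple reward-modulated weight change standing in for operant conditioning. All names (`LIFNeuron`, `cpg_pulse`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LIFNeuron:
    """Discrete-time leaky integrate-and-fire unit (illustrative parameters)."""
    leak: float = 0.9
    potential: float = 0.0

    def integrate(self, current: float) -> float:
        # Decay the membrane potential, then add the incoming current.
        self.potential = self.potential * self.leak + current
        return self.potential


def cpg_pulse(t: int, period: int = 5, amplitude: float = 0.5) -> float:
    """Stand-in for a CPG: a rhythmic excitatory pulse every `period` steps."""
    return amplitude if t % period == 0 else 0.0


def simulate(steps, reward_for_action, weights=(0.30, 0.25), lr=0.1, threshold=1.0):
    """Run the toy AS loop; return the chosen actions and the final weights."""
    w = list(weights)                      # sensory -> decision synaptic weights
    decision = [LIFNeuron(), LIFNeuron()]  # one decision neuron per action
    actions = []
    for t in range(steps):
        drive = cpg_pulse(t)
        stimulus = 1.0                     # assumed constant sensory input
        potentials = [n.integrate(w[i] * stimulus + drive)
                      for i, n in enumerate(decision)]
        if max(potentials) < threshold:
            continue                       # no decision neuron fired this step
        # Winner-take-all: the most depolarized neuron fires, and both are
        # reset, which approximates the winner inhibiting its competitor.
        winner = potentials.index(max(potentials))
        for n in decision:
            n.potential = 0.0
        actions.append(winner)             # motor output collapsed to an index
        # Operant-conditioning stand-in: the consequence of the chosen action
        # strengthens or weakens the synapse that produced it.
        reward = reward_for_action(winner)
        w[winner] = min(1.0, max(0.0, w[winner] + lr * reward))
    return actions, w


if __name__ == "__main__":
    # Hypothetical scenario: action 1 is rewarded, action 0 is punished.
    actions, w = simulate(50, lambda a: 1.0 if a == 1 else -1.0)
    print(actions[:5], w)
```

In this sketch the agent starts with a slight bias toward action 0, is punished for it, and then shifts to the rewarded action 1, which illustrates in a very reduced form how the consequences of a choice can reshape later selection.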
