Paper Information - Action Selection and Operant Conditioning: A Neurorobotic Implementation

Action selection (AS) is thought to be the mechanism that natural agents use to decide on their next move or action. Is there an elementary functional core sustaining this cognitive process? Could we reproduce the mechanism in an artificial agent, and more specifically in a neurorobotic paradigm? Unsupervised autonomous robots may require a decision-making skill to evolve in the real world, and the bioinspired approach is the avenue explored in this paper. We propose simulating an AS process with a small spiking neural network (SNN), as found in lower neural organisms, to control virtual and physical robots. We base our AS process on a simple central pattern generator (CPG), decision neurons, sensory neurons, and motor neurons as the main circuit components. As a novelty, this study targets a specific operant conditioning (OC) context that is relevant to an AS process: choices influence future sensory feedback. Using a simple adaptive scenario, we show the complementary interaction of both phenomena. We also suggest that this AS kernel could be a fast-track model for efficiently designing complex SNNs that include a growing number of input stimuli and motor outputs. Our results demonstrate that merging AS and OC brings flexibility to behavior in generic dynamical situations.

André Cyr | Frédéric Thériault
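
The abstract names the circuit components (a CPG, decision neurons, sensory neurons, motor neurons) but gives no equations, so the following is only a toy sketch of how such a loop might fit together, not the authors' model. Everything in it is an assumption made for illustration: the leaky integrate-and-fire unit, the winner-take-all reset, the reward-gated weight update (a simplification standing in for whatever plasticity rule the paper uses), and all names and parameter values such as `simulate`, `cpg_gain`, and `lr`.

```python
from dataclasses import dataclass


@dataclass
class LIFNeuron:
    """Leaky integrate-and-fire unit, a hypothetical stand-in for a decision neuron."""
    threshold: float = 1.0
    leak: float = 0.9
    v: float = 0.0

    def integrate(self, current: float) -> float:
        # Decay the membrane potential, then add the input current.
        self.v = self.leak * self.v + current
        return self.v


def simulate(steps=400, rewarded_action=1, period=10, cpg_gain=0.2,
             lr=0.05, w_min=0.0, w_max=1.0):
    """Run the toy loop; return (sensory weights, action counts)."""
    decision = [LIFNeuron(), LIFNeuron()]   # one decision neuron per action
    weights = [0.2, 0.2]                    # sensory -> decision synapses
    counts = [0, 0]                         # stands in for motor output
    stimulus = 1.0                          # sensory neuron: stimulus always on

    for t in range(steps):
        # Assumed CPG: a square wave that drives the two decision
        # neurons in alternation, `period` steps each.
        phase = (t // period) % 2
        potentials = [
            n.integrate(cpg_gain * (phase == i) + weights[i] * stimulus)
            for i, n in enumerate(decision)
        ]

        # Winner-take-all: the neuron with the highest potential fires
        # if it reached threshold; otherwise nothing happens this step.
        winner = max(range(2), key=lambda i: potentials[i])
        if potentials[winner] < decision[winner].threshold:
            continue
        for n in decision:
            n.v = 0.0                       # spike reset plus lateral inhibition
        counts[winner] += 1

        # Operant conditioning (simplified): the chosen action changes the
        # feedback, which strengthens or weakens that action's synapse.
        delta = lr if winner == rewarded_action else -lr
        weights[winner] = min(w_max, max(w_min, weights[winner] + delta))

    return weights, counts


if __name__ == "__main__":
    final_weights, action_counts = simulate()
    print("weights:", final_weights)
    print("action counts:", action_counts)
```

With the default values the CPG first favours action 0, but because only action 1 is rewarded, its synapse grows until it wins regardless of the CPG phase. This is one way to picture the abstract's point that choices influence future feedback and that feedback in turn reshapes selection.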
