Reinforcement Learning of Two-Joint Virtual Arm Reaching in a Computer Model of Sensorimotor Cortex

Neocortical mechanisms of learning sensorimotor control involve a complex series of interactions at multiple levels, from synaptic mechanisms to cellular dynamics to network connectomics. We developed a model of sensory and motor neocortex consisting of 704 spiking model neurons. Sensory and motor populations included excitatory cells and two types of interneurons. Neurons were interconnected with AMPA/NMDA and GABA-A synapses. We trained our model using spike-timing-dependent reinforcement learning to control a two-joint virtual arm to reach to a fixed target. For each of 125 trained networks, we used 200 training sessions, each involving 15 s reaches to the target from 16 starting positions. Learning altered network dynamics, enhancing neuronal synchrony and behaviorally relevant information flow between neurons. After learning, networks demonstrated retention of behaviorally relevant memories by using proprioceptive information to perform reach-to-target from multiple starting positions. Networks dynamically controlled which joint rotations to use to reach a target, depending on current arm position. Learning-dependent network reorganization was evident in both sensory and motor populations: learned synaptic weights showed target-specific patterning optimized for particular reach movements. Our model embodies an integrative hypothesis of sensorimotor cortical learning that could be used to interpret future electrophysiological data recorded in vivo from sensorimotor learning experiments. We used our model to make the following predictions: learning enhances synchrony within neuronal populations and behaviorally relevant information flow across neuronal populations; enhanced sensory processing aids task-relevant motor performance; and the relative ease of a particular movement in vivo depends on the amount of sensory information required to complete the movement.
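
The reach task described above can be illustrated with a minimal sketch of a planar two-joint arm and a global reinforcement signal. The abstract does not specify the arm geometry, the reward rule, or the plasticity parameters, so everything here is an assumption made for illustration: the link lengths, the "reward if the hand moved closer to the target" rule, the per-synapse eligibility value, and the names hand_position, reinforcement, and update_weight are all hypothetical and are not taken from the model itself.

```python
import math

# Hypothetical link lengths in arbitrary units (not from the paper).
UPPER_ARM = 1.0
FOREARM = 1.0


def hand_position(shoulder, elbow):
    """Forward kinematics of a planar two-joint arm (angles in radians)."""
    x = UPPER_ARM * math.cos(shoulder) + FOREARM * math.cos(shoulder + elbow)
    y = UPPER_ARM * math.sin(shoulder) + FOREARM * math.sin(shoulder + elbow)
    return x, y


def reinforcement(prev_angles, new_angles, target):
    """Assumed global signal: +1 if the hand moved closer to the target,
    -1 if it moved farther away, 0 if the distance did not change."""
    d_prev = math.dist(hand_position(*prev_angles), target)
    d_new = math.dist(hand_position(*new_angles), target)
    if d_new < d_prev:
        return 1
    if d_new > d_prev:
        return -1
    return 0


def update_weight(weight, eligibility, signal, rate=0.01):
    """Assumed reward-modulated update: scale a synapse's eligibility
    (set by recent pre/post spike timing) by the global signal,
    keeping the weight non-negative."""
    return max(0.0, weight + rate * signal * eligibility)
```

In this sketch the same scalar signal is broadcast to every plastic synapse, and only synapses with a nonzero eligibility value change. Because the hand position depends on both joint angles, which joint rotation earns a reward depends on the current arm configuration, which is consistent with the abstract's observation that networks chose joint rotations according to arm position.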
