A Model of Basal Ganglia Function Unifying Reinforcement Learning and Action Selection

We propose a systems-level computational model of the basal ganglia based closely on known anatomy and physiology. First, we assume that the thalamic output targets of the basal ganglia, which relay ascending information to cortical action and planning areas, are tonically inhibited. Second, we assume that the output stage of the basal ganglia, the internal segment of the globus pallidus (GPi), selects a winner from several potential actions. The potential actions are represented as parallel streams of information, each competing for access to the cortical areas that implement them. The requirement for both tonic inhibition of thalamic nuclei and winner selection leads to a circuit that, in its simplest possible form, has neurons in exactly the configuration found in the basal ganglia. In particular, the subthalamic nucleus, which contains primarily excitatory neurons, is instrumental in implementing a "winner-lose-all" mechanism in the globus pallidus, which selects thalamic targets by disinhibition. We combine this winner-selection mechanism with reinforcement learning, in which dopaminergic neurons in the substantia nigra and the ventral tegmental area modify the efficacy of cortico-striatal synapses. We demonstrate the model's function on two behavioral tasks. The first, termed the Multiarmed Bandit (MAB), is based on hypothesis testing in an environment with uncertain risks and rewards. The model both mimics the behavior of normal humans performing this task and makes predictions regarding the performance of schizophrenics. Its performance on the MAB was sensitive to both the level of presynaptic noise in the striatum and the relative weighting of short-term vs. long-term reward. Its performance on the second task, the Wisconsin Card Sorting Test (WCST), was sensitive to both the level of presynaptic noise in the striatum and the ratio of positive to negative learning rates. The model predicts performance in both Parkinson's disease and schizophrenia.
These results are discussed within the framework of existing data regarding mechanisms of synaptic plasticity, axo-dendritic architectures in the basal ganglia, and disease states.
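
To make the two ingredients of the abstract concrete, the following is a minimal sketch of winner selection combined with a reward-driven weight update on a bandit task. It is not the model described here: the circuit (striatum, subthalamic nucleus, GPi, thalamus) is collapsed into an argmax over noisy channel drives, and the dopaminergic signal is replaced by a simple prediction-error (delta) rule. All names and parameters (`select_action`, `update_weight`, `run_bandit`, `noise_sd`, `lr_pos`, `lr_neg`) are hypothetical and chosen only for illustration; `noise_sd` stands in for presynaptic noise, and `lr_pos`/`lr_neg` for the positive and negative learning rates.

```python
import random


def select_action(weights, noise_sd, rng):
    """Pick the channel with the largest noisy drive.

    Assumption: Gaussian noise added to each weight stands in for
    presynaptic noise; the winning channel is the one "disinhibited".
    """
    drives = [w + rng.gauss(0.0, noise_sd) for w in weights]
    return max(range(len(drives)), key=drives.__getitem__)


def update_weight(weight, reward, lr_pos, lr_neg):
    """Delta-rule update with separate rates for positive and negative errors.

    Assumption: the prediction error (reward - weight) plays the role of
    the dopaminergic teaching signal.
    """
    error = reward - weight
    rate = lr_pos if error >= 0 else lr_neg
    return weight + rate * error


def run_bandit(reward_probs, trials=500, noise_sd=0.1,
               lr_pos=0.1, lr_neg=0.1, seed=0):
    """Run a multi-armed bandit and return final weights and choice counts."""
    rng = random.Random(seed)
    weights = [0.5] * len(reward_probs)   # one weight per action channel
    choices = [0] * len(reward_probs)
    for _ in range(trials):
        arm = select_action(weights, noise_sd, rng)
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        weights[arm] = update_weight(weights[arm], reward, lr_pos, lr_neg)
        choices[arm] += 1
    return weights, choices


if __name__ == "__main__":
    final_weights, counts = run_bandit([0.2, 0.8])
    print("weights:", final_weights)
    print("choices:", counts)
```

Raising `noise_sd` makes selection more exploratory, and making `lr_pos` and `lr_neg` unequal biases learning toward rewarded or unrewarded outcomes. These are the kinds of parameters whose effect on task performance the abstract reports, though the sketch makes no claim about the quantitative results.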
