A novel hypothalamic-midbrain circuit for model-based learning

Behavior is often dichotomized into model-free and model-based systems 1, 2. Model-free behavior prioritizes associations that have high value, regardless of the specific consequence or circumstance. In contrast, model-based behavior involves considering all possible outcomes to produce the behavior that best fits the current circumstance. We typically exhibit a mixture of these behaviors, which lets us trade off efficiency against flexibility. However, substance use disorder shifts behavior more strongly towards model-free systems, making it difficult to abstain from drug-seeking because the high-value, model-free response cannot be withheld 3–10. The lateral hypothalamus (LH) is implicated in substance use disorder 11–17, and we have demonstrated that this region is critical to Pavlovian cue-reward learning 18, 19. However, it is unknown whether learning in the LH is model-free or model-based, where the teaching signal that supports learning in the LH comes from, and whether this is relevant to the learning deficits that drive substance use disorder. Here, we reveal that learning occurring in the LH is model-based. Further, we confirm the existence of an understudied projection extending from dopamine neurons in the ventral tegmental area (VTA) to the LH and demonstrate that this input underlies model-based learning in the LH. Finally, we examine the impact of methamphetamine self-administration on LH-dependent model-based processes. These experiments reveal that a history of methamphetamine administration enhances the model-based control that Pavlovian cues have over decision-making, and that this enhancement is accompanied by a bidirectional strengthening of the LH to VTA circuit. Together, this work reveals a novel bidirectional circuit that underlies model-based learning and is relevant to the behavioral and cognitive changes that arise with substance use disorders.
This circuit represents a new addition to models of addiction, which have focused on the instrumental components of drug addiction and on increases in model-free habits after drug exposure 3–10.

[1]  Stefan Mihalas,et al.  Mesolimbic dopamine release conveys causal associations , 2022, Science.

[2]  Caitlin M Goodpaster,et al.  Dopamine projections to the basolateral amygdala drive the encoding of identity-specific reward memories , 2022, bioRxiv.

[3]  David J. Barker,et al.  The cognitive basis of intracranial self-stimulation of midbrain dopamine neurons , 2022, bioRxiv.

[4]  A. Blaisdell,et al.  Dopamine errors drive excitatory and inhibitory components of backward conditioning in an outcome-specific manner , 2022, Current Biology.

[5]  M. Sharpe,et al.  The basolateral amygdala and lateral hypothalamus bias learning towards motivationally significant events , 2021, Current Opinion in Behavioral Sciences.

[6]  C. Gremel,et al.  Prior Chronic Alcohol Exposure Enhances Pavlovian-to-Instrumental Transfer. , 2021, Alcohol.

[7]  K. Wassum,et al.  The Medial Orbitofrontal Cortex–Basolateral Amygdala Circuit Regulates the Influence of Reward Cues on Adaptive Behavior and Choice , 2021, The Journal of Neuroscience.

[8]  Andrew M. Wikenheiser,et al.  A bidirectional corticoamygdala circuit for the encoding and retrieval of detailed reward memories , 2021, bioRxiv.

[9]  H. Walter,et al.  Association of the OPRM1 A118G polymorphism and Pavlovian-to-instrumental transfer: Clinical relevance for alcohol dependence , 2021, Journal of psychopharmacology.

[10]  Matthew P. H. Gardner,et al.  Past experience shapes the neural circuits recruited for future learning , 2021, Nature Neuroscience.

[11]  Andrew M. Wikenheiser,et al.  Prior Cocaine Use Alters the Normal Evolution of Information Coding in Striatal Ensembles during Value-Guided Decision-Making , 2020, The Journal of Neuroscience.

[12]  S. H. Ahmed,et al.  Habit, choice, and addiction , 2020, Neuropsychopharmacology.

[13]  Matthew P. H. Gardner,et al.  Responding to preconditioned cues is devaluation sensitive and requires orbitofrontal cortex during cue-cue learning , 2020, eLife.

[14]  Hannah M. Batchelor,et al.  Dopamine transients do not act as model-free prediction errors during associative learning , 2020, Nature Communications.

[15]  L. Hogarth Addiction is driven by excessive goal-directed drug choice under negative affect: translational critique of habit and compulsion theory , 2020, Neuropsychopharmacology.

[16]  Geoffrey Schoenbaum,et al.  Causal evidence supporting the proposal that dopamine transients function as temporal difference prediction errors , 2019, Nature Neuroscience.

[17]  J. Waddington,et al.  Dopaminergic mechanisms in the lateral hypothalamus regulate feeding behavior in association with neuropeptides. , 2019, Biochemical and biophysical research communications.

[18]  G. Schoenbaum,et al.  Dopamine neuron ensembles signal the content of sensory prediction errors , 2019, bioRxiv.

[19]  Ilana B. Witten,et al.  Specialized coding of sensory, motor, and cognitive variables in VTA dopamine neurons , 2019, Nature.

[20]  G. Schoenbaum,et al.  Expectancy-Related Changes in Dopaminergic Error Signals Are Impaired by Cocaine Self-Administration , 2019, Neuron.

[21]  P. Janak,et al.  Ventral Tegmental Dopamine Neurons Participate in Reward Identity Predictions , 2019, Current Biology.

[22]  G. Aston-Jones,et al.  Increased Number and Activity of a Lateral Subpopulation of Hypothalamic Orexin/Hypocretin Neurons Underlies the Expression of an Addicted State in Rats , 2018, Biological Psychiatry.

[23]  Laura A. Bradfield,et al.  Inferring action-dependent outcome representations depends on anterior but not posterior medial orbitofrontal cortex , 2018, Neurobiology of Learning and Memory.

[24]  Geoffrey Schoenbaum,et al.  Rethinking dopamine as generalized prediction error , 2018, bioRxiv.

[25]  Jordan M Blacktop,et al.  Perineuronal nets in the lateral hypothalamus area regulate cue-induced reinstatement of cocaine-seeking behavior , 2018, Neuropsychopharmacology.

[26]  Matthew P. H. Gardner,et al.  Brief, But Not Prolonged, Pauses in the Firing of Midbrain Dopamine Neurons Are Sufficient to Produce a Conditioned Inhibitor , 2018, The Journal of Neuroscience.

[27]  A. Izquierdo,et al.  Persistent effect of withdrawal from intravenous methamphetamine self-administration on brain activation and behavioral economic indices involving an effort cost , 2018, Neuropharmacology.

[28]  L. Hogarth,et al.  Intact goal‐directed control in treatment‐seeking drug users indexed by outcome‐devaluation and Pavlovian to instrumental transfer: critique of habit theory , 2018, The European journal of neuroscience.

[29]  B. Balleine,et al.  Methamphetamine promotes habitual action and alters the density of striatal glutamate receptor and vesicular proteins in dorsal striatum , 2018, Addiction biology.

[30]  T. Kahnt,et al.  Identity prediction errors in the human midbrain update reward-identity expectations in the orbitofrontal cortex , 2018, Nature Communications.

[31]  Y. Niv,et al.  Model-based predictions for dopamine , 2018, Current Opinion in Neurobiology.

[32]  Matthew P. H. Gardner,et al.  Optogenetic Blockade of Dopamine Transients Prevents Learning Induced by Changes in Reward Features , 2017, Current Biology.

[33]  Hannah M. Batchelor,et al.  Dopamine Neurons Respond to Errors in the Prediction of Sensory Features of Expected Rewards , 2017, Neuron.

[34]  C. Cepeda,et al.  Basolateral Amygdala to Orbitofrontal Cortex Projections Enable Cue-Triggered Reward Expectations , 2017, The Journal of Neuroscience.

[35]  Y. Niv,et al.  Lateral Hypothalamic GABAergic Neurons Encode Reward Predictions that Are Relayed to the Ventral Tegmental Area to Regulate Learning , 2017, Current Biology.

[36]  J. Jentsch,et al.  Steep effort discounting of a preferred reward over a freely-available option in prolonged methamphetamine withdrawal in male rats , 2017, Psychopharmacology.

[37]  Heena R. Manglani,et al.  Pavlovian-to-Instrumental Transfer of Nicotine and Food Cues in Deprived Cigarette Smokers , 2017, Nicotine & tobacco research : official journal of the Society for Research on Nicotine and Tobacco.

[38]  Miriam Sebold,et al.  When Habits Are Dangerous: Alcohol Expectancies and Habitual Decision Making Predict Relapse in Alcohol Dependence , 2017, Biological Psychiatry.

[39]  A. Reiter,et al.  Model-Based Control in Dimensional Psychiatry , 2017, Biological Psychiatry.

[40]  Joshua L. Jones,et al.  Dopamine transients are sufficient and necessary for acquisition of model-based associations , 2017, Nature Neuroscience.

[41]  Donna J. Calu,et al.  The Dopamine Prediction Error: Contributions to Associative Models of Reward Learning , 2017, Front. Psychol.

[42]  B. Balleine,et al.  Pulling habits out of rats: adenosine 2A receptor antagonism in dorsomedial striatum rescues meth‐amphetamine‐induced deficits in goal‐directed action , 2017, Addiction biology.

[43]  P. Janak,et al.  Changes in the Influence of Alcohol-Paired Stimuli on Alcohol Seeking across Extended Training , 2016, Front. Psychiatry.

[44]  Edward H. Nieh,et al.  Inhibitory Input from the Lateral Hypothalamus to the Ventral Tegmental Area Disinhibits Dopamine Neurons and Promotes Behavioral Activation , 2016, Neuron.

[45]  R. F. Westbrook,et al.  Daily Exposure to Sucrose Impairs Subsequent Learning About Food Cues: A Role for Alterations in Ghrelin Signaling and Dopamine D2 Receptors , 2016, Neuropsychopharmacology.

[46]  R. Wise,et al.  Feeding and Reward Are Differentially Induced by Activating GABAergic Lateral Hypothalamic Projections to VTA , 2016, The Journal of Neuroscience.

[47]  K. Wassum,et al.  Nucleus accumbens core dopamine signaling tracks the need‐based motivational value of food‐paired cues , 2016, Journal of neurochemistry.

[48]  T. Robbins,et al.  Drug Addiction: Updating Actions to Habits to Compulsions Ten Years On. , 2016, Annual review of psychology.

[49]  Guillem R. Esber,et al.  Brief optogenetic inhibition of dopamine neurons mimics endogenous negative reward prediction errors , 2015, Nature Neuroscience.

[50]  Laura A. Bradfield,et al.  Medial Orbitofrontal Cortex Mediates Outcome Retrieval in Partially Observable Task Situations , 2015, Neuron.

[51]  F. Clascá,et al.  Long-range projection neurons of the mouse ventral tegmental area: a single-cell axon tracing analysis , 2015, Front. Neuroanat.

[52]  M. Picciotto,et al.  GABAergic and glutamatergic efferents of the mouse ventral tegmental area , 2014, The Journal of comparative neurology.

[53]  L. Deserno,et al.  Model-Based and Model-Free Decisions in Alcohol Dependence , 2014, Neuropsychobiology.

[54]  Eric A Thrailkill,et al.  Temporal integration and instrumental conditioned reinforcement , 2014, Learning & behavior.

[55]  A. Bonci,et al.  A Critical Role of Lateral Hypothalamus in Context-Induced Relapse to Alcohol Seeking after Punishment-Imposed Abstinence , 2014, The Journal of Neuroscience.

[56]  S. Ostlund,et al.  Phasic Mesolimbic Dopamine Signaling Encodes the Facilitation of Incentive Motivation Produced by Repeated Cocaine Exposure , 2014, Neuropsychopharmacology.

[57]  S. Killcross,et al.  The prelimbic cortex contributes to the down-regulation of attention toward redundant cues. , 2014, Cerebral cortex.

[58]  P. Dayan,et al.  Model-based and model-free Pavlovian reward learning: Revaluation, revision, and revelation , 2014, Cognitive, affective & behavioral neuroscience.

[59]  R. Bodnar,et al.  Effect of dopamine D1 and D2 receptor antagonism in the lateral hypothalamus on the expression and acquisition of fructose-conditioned flavor preference in rats , 2014, Brain Research.

[60]  Joshua L. Jones,et al.  Disruption of model-based behavior and learning by cocaine self-administration in rats , 2013, Psychopharmacology.

[61]  B. Everitt,et al.  Addiction: failure of control over maladaptive incentive habits , 2013, Current Opinion in Neurobiology.

[62]  Josiah R. Boivin,et al.  A Causal Link Between Prediction Errors, Dopamine Neurons and Learning , 2013, Nature Neuroscience.

[63]  S. Ostlund,et al.  Repeated Cocaine Exposure Facilitates the Expression of Incentive Motivation and Induces Habitual Control in Rats , 2013, PloS one.

[64]  P. Phillips,et al.  Pavlovian valuation systems in learning and decision making , 2012, Current Opinion in Neurobiology.

[65]  G. Aston-Jones,et al.  Fos Activation of Selective Afferents to Ventral Tegmental Area during Cue-Induced Reinstatement of Cocaine Seeking in Rats , 2012, The Journal of Neuroscience.

[66]  S. Ostlund,et al.  Pavlovian-to-instrumental transfer in cocaine seeking rats. , 2012, Behavioral neuroscience.

[67]  I. McGregor,et al.  Regional c-Fos and FosB/ΔFosB expression associated with chronic methamphetamine self-administration and methamphetamine-seeking behavior in rats , 2012, Neuroscience.

[68]  L. Hogarth,et al.  Evaluating psychological markers for human nicotine dependence: tobacco choice, extinction, and Pavlovian-to-instrumental transfer. , 2012, Experimental and clinical psychopharmacology.

[69]  Anne E Carpenter,et al.  Neuron-type specific signals for reward and punishment in the ventral tegmental area , 2011, Nature.

[70]  G. McNally,et al.  The hypothalamus and the neurobiology of drug seeking , 2012, Cellular and Molecular Life Sciences.

[71]  Alice M Stamatakis,et al.  Neural correlates of Pavlovian‐to‐instrumental transfer in the nucleus accumbens shell are selectively potentiated following cocaine self‐administration , 2011, European Journal of Neuroscience.

[72]  P. Glimcher Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis , 2011, Proceedings of the National Academy of Sciences.

[73]  T. Shahan Conditioned reinforcement and response strength. , 2010, Journal of the experimental analysis of behavior.

[74]  David E. Moorman,et al.  Role of lateral hypothalamic orexin neurons in reward processing and addiction , 2009, Neuropharmacology.

[75]  R. Bodnar,et al.  Lateral hypothalamus dopamine D1-like receptors and glucose-conditioned flavor preferences in rats , 2009, Neurobiology of Learning and Memory.

[76]  K. Deisseroth,et al.  Phasic Firing in Dopaminergic Neurons Is Sufficient for Behavioral Conditioning , 2009, Science.

[77]  G. McNally,et al.  Lateral Hypothalamus Is Required for Context-Induced Reinstatement of Extinguished Reward Seeking , 2009, The Journal of Neuroscience.

[78]  M. Scott Bowers,et al.  Cocaine but Not Natural Reward Self-Administration nor Passive Cocaine Infusion Produces Persistent LTP in the VTA , 2008, Neuron.

[79]  Geoffrey Schoenbaum,et al.  The role of the orbitofrontal cortex in the pursuit of happiness and more specific rewards , 2008, Nature.

[80]  G. McNally,et al.  Renewal of extinguished cocaine-seeking , 2008, Neuroscience.

[81]  James Robert Brašić,et al.  Persistent cognitive and dopamine transporter deficits in abstinent methamphetamine users , 2008, Synapse.

[82]  M. Roesch,et al.  Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards , 2007, Nature Neuroscience.

[83]  G. Schoenbaum,et al.  Conditioned Reinforcement can be Mediated by Either Outcome-Specific or General Affective Representations , 2007, Frontiers in integrative neuroscience.

[84]  P. Holland,et al.  Reinforcer-specificity of appetitive and consummatory behavior of rats after Pavlovian conditioning with food reinforcers , 2007, Physiology & Behavior.

[85]  G. McNally,et al.  The neural correlates and role of D1 dopamine receptors in renewal of extinguished alcohol-seeking , 2007, Neuroscience.

[86]  P. Janak,et al.  Ethanol-associated cues produce general pavlovian-instrumental transfer. , 2007, Alcoholism: Clinical and Experimental Research.

[87]  N. Volkow,et al.  Cocaine Cues and Dopamine in Dorsal Striatum: Mechanism of Craving in Cocaine Addiction , 2006, The Journal of Neuroscience.

[88]  S. Killcross,et al.  Amphetamine Exposure Enhances Habit Formation , 2006, The Journal of Neuroscience.

[89]  B. Balleine,et al.  Effects of outcome devaluation on the performance of a heterogeneous instrumental chain , 2005 .

[90]  P. Dayan,et al.  Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.

[91]  T. Robbins,et al.  Neural systems of reinforcement for drug addiction: from actions to habits to compulsion , 2005, Nature Neuroscience.

[92]  G. Aston-Jones,et al.  A role for lateral hypothalamic orexin neurons in reward seeking , 2005, Nature.

[93]  G. Koob,et al.  Gene expression evidence for remodeling of lateral hypothalamic circuitry in cocaine addiction. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[94]  P. Glimcher,et al.  Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal , 2005, Neuron.

[95]  D. Segal,et al.  Prolonged exposure of rats to intravenous methamphetamine: behavioral and neurochemical characterization , 2005, Psychopharmacology.

[96]  Y. Shaham,et al.  Incubation of cocaine craving after withdrawal: a review of preclinical data , 2004, Neuropharmacology.

[97]  N. Volkow,et al.  Loss of Dopamine Transporters in Methamphetamine Abusers Recovers with Protracted Abstinence , 2001, The Journal of Neuroscience.

[98]  K. Berridge,et al.  Incentive Sensitization by Previous Amphetamine Exposure: Increased Cue-Triggered “Wanting” for Sucrose Reward , 2001, The Journal of Neuroscience.

[99]  K. Berridge,et al.  Intra-Accumbens Amphetamine Increases the Conditioned Incentive Salience of Sucrose Reward: Enhancement of Reward “Wanting” without Enhanced “Liking” or Response Reinforcement , 2000, The Journal of Neuroscience.

[100]  D. Wong,et al.  Reduced Striatal Dopamine Transporter Density in Abstinent Methamphetamine and Methcathinone Users: Evidence from Positron Emission Tomography Studies with [11C]WIN-35,428 , 1998, The Journal of Neuroscience.

[101]  B. Balleine,et al.  Goal-directed instrumental action: contingency and incentive learning and their cortical substrates , 1998, Neuropharmacology.

[102]  Peter Dayan,et al.  A Neural Substrate of Prediction and Reward , 1997, Science.

[103]  R. Colwill,et al.  Encoding of the unconditioned stimulus in Pavlovian conditioning , 1994 .

[104]  B A Williams,et al.  Conditioned Reinforcement: Experimental and Theoretical Issues , 1994, The Behavior analyst.

[105]  L. Hernández,et al.  Ventromedial hypothalamus vs. lateral hypothalamic D2 satiety receptors in the body weight increase induced by systemic sulpiride , 1991, Physiology & Behavior.

[106]  B. Hoebel,et al.  Dopamine in the lateral hypothalamus may be involved in the inhibition of locomotion related to food and water seeking , 1990, Brain Research Bulletin.

[107]  B. Hoebel,et al.  Sulpiride injections in the lateral hypothalamus induce feeding and drinking in rats , 1988, Pharmacology Biochemistry and Behavior.

[108]  D. C. Howell Statistical Methods for Psychology , 1987 .

[109]  R. Rescorla A Pavlovian Analysis of Goal-Directed Behavior. , 1987 .

[110]  W. Nauta,et al.  Efferent connections of the substantia nigra and ventral tegmental area in the rat , 1979, Brain Research.

[111]  T. S. Hyde The effect of Pavlovian stimuli on the acquisition of a new response , 1976 .

[112]  P. Teitelbaum,et al.  Hypothalamic Control of Feeding and Self-Stimulation , 1962, Science.

[113]  P. Morgane Distinct "Feeding" and "Hunger Motivating" Systems in the Lateral Hypothalamus of the Rat , 1961, Science.

[114]  J. Brobeck,et al.  Localization of a “Feeding Center” in the Hypothalamus of the Rat , 1951, Proceedings of the Society for Experimental Biology and Medicine. Society for Experimental Biology and Medicine.