Simulating future value in intertemporal choice

The laboratory study of how humans and other animals trade off value and time has a long history and a vast literature. Despite this, there is no agreed-upon mechanistic explanation of how intertemporal choice preferences arise. Several theorists have recently proposed model-based reinforcement learning as a candidate framework. This framework describes a suite of algorithms by which a model of the environment, in the form of a state transition function and a reward function, can be converted on-line into a decision. The state transition function allows the model-based system to base decisions on projected future states, while the reward function assigns value to each state; together they capture the components necessary for successful intertemporal choice. Empirical work has also pointed to a possible relationship between increased prospection and reduced discounting. In the current paper, we look for direct evidence of a relationship between temporal discounting and model-based control in a large new data set (n = 168). Testing the relationship under several different modeling formulations, however, revealed no indication that the two quantities are related.
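To make the model-based idea concrete, the following is a minimal sketch of how a state transition function and a reward function can be turned into a choice between a smaller-sooner and a larger-later reward. It is not the model fitted in the paper: the five-state toy environment, the reward magnitudes, the exponential discount factor `gamma`, and the names `plan` and `preferred_action` are all assumptions made for illustration, and value iteration is used as one generic example of a model-based algorithm.

```python
# Illustrative sketch only: a generic value-iteration planner on a toy
# environment. The states, rewards, and discount factor are hypothetical.

def plan(transition, reward, gamma, n_sweeps=50):
    """Compute action values Q[s][a] from a known model.

    transition[s][a] maps each next state to its probability.
    reward[s] is the reward received on entering state s.
    gamma is an assumed exponential discount factor per step.
    """
    n_states = len(transition)
    n_actions = len(transition[0])
    V = [0.0] * n_states
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_sweeps):
        for s in range(n_states):
            for a in range(n_actions):
                # Expected reward plus discounted value of the projected state.
                Q[s][a] = sum(
                    p * (reward[s2] + gamma * V[s2])
                    for s2, p in transition[s][a].items()
                )
        V = [max(q) for q in Q]
    return Q


# Toy model: state 0 = choice point, 1 = smaller-sooner outcome,
# 2 = waiting, 3 = larger-later outcome, 4 = absorbing end state.
TRANSITION = [
    [{1: 1.0}, {2: 1.0}],  # action 0: take sooner; action 1: wait
    [{4: 1.0}, {4: 1.0}],
    [{3: 1.0}, {3: 1.0}],
    [{4: 1.0}, {4: 1.0}],
    [{4: 1.0}, {4: 1.0}],
]
REWARD = [0.0, 5.0, 0.0, 10.0, 0.0]


def preferred_action(gamma):
    """Return the action with the highest value at the choice point."""
    Q = plan(TRANSITION, REWARD, gamma)
    return max(range(len(Q[0])), key=lambda a: Q[0][a])


if __name__ == "__main__":
    for gamma in (0.9, 0.4):
        print(gamma, preferred_action(gamma))
```

In this toy setting a patient agent (`gamma = 0.9`) values waiting at 0.9 × 10 = 9 and prefers the larger-later reward, while an impatient agent (`gamma = 0.4`) values it at 4 and takes the smaller-sooner reward of 5.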
