Reinforcement Learning Models of Human Behavior: Reward Processing in Mental Disorders

Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for the reinforcement learning problem. It extends the standard Q-learning approach with a two-stream model of reward processing, whose biases are biologically associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. For the AI community, developing agents that react differently to different types of rewards can help us understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems. Empirically, the proposed model outperforms Q-Learning and Double Q-Learning in artificial scenarios with certain reward distributions and in real-world human decision-making gambling tasks. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model that captures reward processing abnormalities across multiple mental conditions, as well as user preferences in long-term recommendation systems.
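To make the two-stream idea concrete, the following is a minimal tabular sketch, not the update rule of the proposed model, which is not specified in this abstract. It assumes that each reward is split into a non-negative and a non-positive part, that each part drives its own Q-table, and that actions are chosen greedily on the sum of the two tables. The class name `TwoStreamQ`, the weights `w_pos` and `w_neg`, and all default values are hypothetical and chosen only for illustration.

```python
import random


class TwoStreamQ:
    """Illustrative two-stream tabular Q-learner (hypothetical sketch).

    Positive and negative rewards are tracked in separate tables;
    w_pos and w_neg are assumed bias weights on each stream.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 epsilon=0.1, w_pos=1.0, w_neg=1.0, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.w_pos, self.w_neg = w_pos, w_neg
        self.rng = random.Random(seed)
        self.q_pos = [[0.0] * n_actions for _ in range(n_states)]
        self.q_neg = [[0.0] * n_actions for _ in range(n_states)]

    def _greedy(self, state):
        # Greedy action with respect to the sum of both streams.
        total = [p + n for p, n in zip(self.q_pos[state], self.q_neg[state])]
        return max(range(self.n_actions), key=total.__getitem__)

    def act(self, state):
        # Epsilon-greedy exploration over the combined value estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self._greedy(state)

    def update(self, state, action, reward, next_state):
        # Split the scalar reward into its positive and negative parts.
        r_pos, r_neg = max(reward, 0.0), min(reward, 0.0)
        best_next = self._greedy(next_state)
        # Each stream bootstraps from its own table at the shared greedy action.
        for table, weight, r in ((self.q_pos, self.w_pos, r_pos),
                                 (self.q_neg, self.w_neg, r_neg)):
            target = weight * r + self.gamma * table[next_state][best_next]
            table[state][action] += self.alpha * (target - table[state][action])
```

Under these assumptions, setting `w_neg=0.0` yields an agent that ignores losses entirely, while `w_neg > w_pos` yields a loss-sensitive one; with `w_pos = w_neg = 1.0` the sum of the two tables follows the ordinary Q-learning update. Varying such weights is one way a single parametric family could express different reward-processing biases.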
