Is Avoiding an Aversive Outcome Rewarding? Neural Substrates of Avoidance Learning in the Human Brain

Avoidance learning poses a challenge for reinforcement-based theories of instrumental conditioning: once an aversive outcome is successfully avoided, an individual may no longer experience extrinsic reinforcement for their behavior. One possible account is that avoiding an aversive outcome is in itself rewarding, so that avoidance behavior is positively reinforced on each trial in which the aversive outcome is successfully avoided. In the present study, we tested this possibility by determining whether avoidance of an aversive outcome recruits the same neural circuitry as receipt of a reward. We scanned 16 human participants with functional MRI while they performed an instrumental choice task in which, on each trial, they chose one of two actions in order either to win money or to avoid losing money. Neural activity in the medial orbitofrontal cortex, a region previously implicated in encoding stimulus reward value, increased not only following receipt of reward but also following successful avoidance of an aversive outcome. This neural signal may itself act as an intrinsic reward, thereby serving to reinforce actions during instrumental avoidance.
