Choosing and learning: outcome valence differentially affects learning from free versus forced choices

Positivity bias refers to learning more from positive than from negative events. This learning asymmetry could either reflect a preference for positive events in general, or be the upshot of a more general, and perhaps ubiquitous, “choice-confirmation” bias, whereby agents preferentially integrate information that confirms their previous decision. We systematically compared these two theories in three experiments mixing free- and forced-choice conditions, featuring factual and counterfactual learning and varying action requirements across “go” and “no-go” trials. Computational analyses of learning rates showed clear and robust evidence in favour of the “choice-confirmation” theory: participants amplified positive prediction errors in free-choice conditions while being valence-neutral in forced-choice conditions. We suggest that a choice-confirmation bias is adaptive to the extent that it reinforces actions that are most likely to meet an individual’s needs, i.e. freely chosen actions. In contrast, outcomes from unchosen actions are more likely to be treated impartially, i.e. to be assigned no special value in self-determined decisions.
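
The learning-rate asymmetry described in the abstract can be illustrated with a minimal sketch of a value update that uses separate learning rates for positive and negative prediction errors on free-choice trials and a single learning rate on forced-choice trials. This is not the authors' fitted model: the function name, the parameter names, and the numerical values below are hypothetical and chosen only to make the idea concrete.

```python
def update_value(q, reward, free_choice,
                 alpha_pos_free=0.4, alpha_neg_free=0.1, alpha_forced=0.2):
    """Return the updated value estimate after observing one outcome.

    All parameter values are illustrative assumptions, not estimates
    reported in the paper.
    """
    # Prediction error: difference between the outcome and the current estimate.
    delta = reward - q
    if free_choice:
        # Free choice: positive prediction errors are weighted more heavily
        # than negative ones (choice-confirmation asymmetry).
        alpha = alpha_pos_free if delta > 0 else alpha_neg_free
    else:
        # Forced choice: one learning rate regardless of valence.
        alpha = alpha_forced
    return q + alpha * delta


# Starting from a neutral estimate, compare a gain and a loss of equal size.
free_gain = update_value(0.0, 1.0, free_choice=True)
free_loss = update_value(0.0, -1.0, free_choice=True)
forced_gain = update_value(0.0, 1.0, free_choice=False)
forced_loss = update_value(0.0, -1.0, free_choice=False)
```

With these assumed values, a freely chosen option moves further after a gain than after an equally sized loss, whereas a forced option moves by the same amount in both directions.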
