Observing others stay or switch – How social prediction errors are integrated into reward reversal learning

Reward properties of stimuli can change suddenly, and detecting these 'reversals' is often made difficult by the probabilistic nature of rewards and punishments. Here we tested whether and how humans use social information (someone else's choices) to overcome uncertainty during reversal learning. We found a substantial social influence during reversal learning, which was modulated by the type of observed behavior. Participants frequently followed conservative choices (no switches after punishment) made by the (fictitious) other player but ignored impulsive choices (switches), even though the experiment was set up so that both types of response would be similarly beneficial or detrimental (Study 1). Computational modeling showed that participants integrated the observed choices as a 'social prediction error' rather than ignoring or blindly following the other player. Modeling also confirmed higher learning rates for 'conservative' than for 'impulsive' social prediction errors. Importantly, this 'conservative bias' was boosted by interpersonal similarity, which, together with the absence of such effects in a non-social control experiment (Study 2), confirmed its social nature. A third study suggested that the relative weighting of observed impulsive responses increased with volatility (the frequency of reversals). Finally, simulations showed that, in the present paradigm, integrating social and reward information was not necessarily more adaptive for maximizing earnings than learning from reward alone. Moreover, integrating social information increased accuracy only when conservative and impulsive choices were weighted similarly during learning. These findings suggest that, to guide decisions in choice contexts that involve reward reversals, humans use social cues that conform with their preconceptions more strongly than cues that conflict with them, especially when the other person is similar to them.
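
The idea of a 'social prediction error' with separate learning rates can be illustrated with a minimal sketch. This is not the model fitted in the studies: the function names, the 0/1 outcome coding, the treatment of an observed choice as a pseudo-outcome of 1, and all parameter values are assumptions made here for illustration only.

```python
def reward_update(q, choice, outcome, alpha_reward=0.3):
    """Standard delta-rule update from one's own feedback.

    q       -- list of two option values in [0, 1] (assumed coding)
    choice  -- index (0 or 1) of the option the participant chose
    outcome -- 1.0 for reward, 0.0 for punishment (assumed coding)
    """
    new_q = list(q)
    reward_pe = outcome - q[choice]              # reward prediction error
    new_q[choice] = q[choice] + alpha_reward * reward_pe
    return new_q


def social_update(q, observed_choice, other_stayed,
                  alpha_conservative=0.4, alpha_impulsive=0.1):
    """Update from watching the other player's choice.

    The observed choice is treated as evidence in favor of that option
    (pseudo-outcome of 1.0, an illustrative assumption). The learning
    rate depends on whether the other player stayed (conservative) or
    switched (impulsive); the default values are hypothetical.
    """
    alpha = alpha_conservative if other_stayed else alpha_impulsive
    new_q = list(q)
    social_pe = 1.0 - q[observed_choice]         # social prediction error
    new_q[observed_choice] = q[observed_choice] + alpha * social_pe
    return new_q
```

With both values starting at 0.5, observing a conservative choice moves the observed option's value further than observing an impulsive one, which is one way to express the 'conservative bias' described above. Setting both social learning rates to zero reduces the sketch to learning from reward alone.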
