Modeling changes in probabilistic reinforcement learning during adolescence

In the real world, many relationships between events are uncertain and probabilistic. Uncertainty is also likely to be a more common feature of daily experience for youth because they have less experience to draw from than adults. Some studies suggest probabilistic learning may be less efficient in youth than in adults [1], while others suggest it may be more efficient in youth in mid-adolescence [2, 3]. Here we used a probabilistic reinforcement learning task to test how youth aged 8-17 (N = 187) and adults aged 18-30 (N = 110) learn about stable probabilistic contingencies. Performance increased with age through the early twenties, then stabilized. Using hierarchical Bayesian methods to fit computational reinforcement learning models, we show that all participants' performance was better explained by models in which negative outcomes had minimal to no impact on learning. The performance increase with age was driven by (1) an increase in learning rate (i.e., a decrease in integration time horizon) and (2) a decrease in noisy/exploratory choices. In mid-adolescence (ages 13-15), salivary testosterone and learning rate were positively related. We discuss our findings in the context of other studies and hypotheses about adolescent brain development.

Author summary

Adolescence is a time of great uncertainty. It is also a critical time for brain development, learning, and decision making in social and educational domains. Current findings about learning in adolescence are contradictory. We sought to better isolate how learning from stable probabilistic contingencies changes during adolescence, using a task that previously showed interesting results in adolescents. We recruited a relatively large sample (297 participants) across a wide age range (8-30) to trace the adolescent developmental trajectory of learning under stable but uncertain conditions. We found that older age in our sample was associated with higher learning rates and less exploratory choice. Within narrow age bins, we found that higher salivary testosterone levels were associated with higher learning rates in participants aged 13-15 years. These findings can help us better isolate the maturational trajectory of core learning and decision-making processes during adolescence.
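
The model family described in the abstract can be illustrated with a minimal sketch. It is not the authors' fitted model. It assumes a two-option task with binary outcomes, a delta-rule value update with separate learning rates for rewarded and unrewarded trials, and a softmax choice rule. The names `alpha_pos`, `alpha_neg`, and `beta`, and the initial value of 0.5, are illustrative assumptions rather than details taken from the paper.

```python
import math
import random


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing option A under a two-option softmax.

    beta is an assumed inverse-temperature parameter: lower beta means
    noisier, more exploratory choices.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def update_value(q, reward, alpha_pos, alpha_neg):
    """Delta-rule update with a separate learning rate per outcome type.

    Setting alpha_neg near 0 means unrewarded (negative) outcomes have
    minimal to no impact on the learned value.
    """
    alpha = alpha_pos if reward > 0 else alpha_neg
    return q + alpha * (reward - q)


def simulate(n_trials, p_reward, alpha_pos, alpha_neg, beta, seed=0):
    """Simulate a two-option task with stable reward probabilities.

    Returns the fraction of trials on which the better option was chosen
    and the final value estimates.
    """
    rng = random.Random(seed)
    q = [0.5, 0.5]  # assumed neutral starting values
    best = 0 if p_reward[0] >= p_reward[1] else 1
    n_best = 0
    for _ in range(n_trials):
        p_a = softmax_choice_prob(q[0], q[1], beta)
        choice = 0 if rng.random() < p_a else 1
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        q[choice] = update_value(q[choice], reward, alpha_pos, alpha_neg)
        n_best += int(choice == best)
    return n_best / n_trials, q


if __name__ == "__main__":
    frac_best, values = simulate(
        n_trials=200, p_reward=(0.8, 0.2),
        alpha_pos=0.3, alpha_neg=0.0, beta=5.0,
    )
    print(frac_best, values)
```

Under a constant learning rate alpha, the weight this update places on an outcome k updates in the past decays as alpha(1 - alpha)^k, so a larger learning rate corresponds to a shorter integration time horizon, which is the relationship the abstract refers to.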

[1]  Ronald E. Dahl,et al.  Reinforcement Learning and Bayesian Inference Provide Complementary Models for the Unique Advantage of Adolescents in Stochastic Reversal , 2020, bioRxiv.

[2]  Gunnar Blohm,et al.  Appreciating the variety of goals in computational neuroscience. , 2020, 2002.03211.

[3]  E. Telzer,et al.  Modernizing Conceptions of Valuation and Cognitive-Control Deployment in Adolescent Risk Taking , 2020, Current directions in psychological science.

[4]  T. Yarkoni,et al.  The generalizability crisis , 2019, Behavioral and Brain Sciences.

[5]  Anne G E Collins,et al.  Modeling the influence of working memory, reinforcement, and action uncertainty on reaction time and choice during instrumental learning , 2019, Psychonomic bulletin & review.

[6]  Maria K. Eckstein,et al.  Disentangling the systems contributing to changes in learning during adolescence , 2019, Developmental Cognitive Neuroscience.

[7]  Willem E. Frankenhuis,et al.  Modeling the evolution of sensitive periods , 2019, Developmental Cognitive Neuroscience.

[8]  Catherine A. Hartley,et al.  Reinforcement learning across development: What insights can we draw from a decade of research? , 2019, Developmental Cognitive Neuroscience.

[9]  Benjamin W. Nelson,et al.  Study Protocol: Transitions in Adolescent Girls (TAG) , 2019, Frontiers in Psychiatry.

[10]  Hauke R Heekeren,et al.  The computational basis of following advice in adolescents. , 2018, Journal of experimental child psychology.

[11]  Robert C. Wilson,et al.  Ten simple rules for the computational modeling of behavioral data , 2019, eLife.

[12]  L. Wilbrecht,et al.  Adolescence and "Late Blooming" Synapses of the Prefrontal Cortex. , 2019, Cold Spring Harbor symposia on quantitative biology.

[13]  Kentaro Katahira,et al.  The statistical structures of reinforcement learning with asymmetric value updates , 2018, Journal of Mathematical Psychology.

[14]  Michael Moutoussis,et al.  Change, stability, and instability in the Pavlovian guidance of behaviour from adolescence to young adulthood , 2018, PLoS Comput. Biol..

[15]  B. Luna,et al.  Adolescence as a neurobiological critical period for the development of higher-order cognition , 2018, Neuroscience & Biobehavioral Reviews.

[16]  D. Navarro Between the Devil and the Deep Blue Sea: Tensions Between Scientific Judgement and Statistical Model Selection , 2018, Computational Brain & Behavior.

[17]  Michael X. Cohen,et al.  How the Level of Reward Awareness Changes the Computational and Electrophysiological Signatures of Reinforcement Learning , 2018, The Journal of Neuroscience.

[18]  U. Simonsohn Two Lines: A Valid Alternative to the Invalid Testing of U-Shaped Relationships With Quadratic Regressions , 2018, Advances in Methods and Practices in Psychological Science.

[19]  Nicholas B. Allen,et al.  Importance of investing in adolescence from a developmental science perspective , 2018, Nature.

[20]  Daeyeol Lee,et al.  Feature-based learning improves adaptability without compromising precision , 2017, Nature Communications.

[21]  M. Paul,et al.  Adolescence and Reward: Making Sense of Neural and Behavioral Changes Amid the Chaos , 2017, The Journal of Neuroscience.

[22]  Adriana Galván,et al.  Frontostriatal development and probabilistic reinforcement learning during adolescence , 2017, Neurobiology of Learning and Memory.

[23]  R. Dahl,et al.  Social status strategy in early adolescent girls: Testosterone and value-based decision making , 2017, Psychoneuroendocrinology.

[24]  E. Koechlin,et al.  The Importance of Falsification in Computational Cognitive Modeling , 2017, Trends in Cognitive Sciences.

[25]  M. Lebreton,et al.  Behavioural and neural characterization of optimistic reinforcement learning , 2017, Nature Human Behaviour.

[26]  Yuan Chang Leong,et al.  Dynamic Interaction between Reinforcement Learning and Attention in Multidimensional Environments , 2017, Neuron.

[27]  Jiqiang Guo,et al.  Stan: A Probabilistic Programming Language. , 2017, Journal of statistical software.

[28]  Josiah R. Boivin,et al.  Does puberty mark a transition in sensitive periods for plasticity in the associative neocortex? , 2017, Brain Research.

[29]  Juliet Y. Davidow,et al.  An Upside to Reward Sensitivity: The Hippocampus Supports Enhanced Reinforcement Learning in Adolescence , 2016, Neuron.

[30]  Kentaro Katahira,et al.  How hierarchical models improve point estimates of model parameters at the individual level , 2016 .

[31]  Stefano Palminteri,et al.  The Computational Development of Reinforcement Learning during Adolescence , 2016, PLoS Comput. Biol..

[32]  S. Blakemore,et al.  Adolescence as a Sensitive Period of Brain Development , 2015, Trends in Cognitive Sciences.

[33]  A. V. van Duijvenvoorde,et al.  Longitudinal Changes in Adolescent Risk-Taking: A Comprehensive Study of Neural Responses to Rewards, Pubertal Development, and Risk-Taking Behavior , 2015, The Journal of Neuroscience.

[34]  B. B. Doll,et al.  Experiential reward learning outweighs instruction prior to adulthood , 2015, Cognitive, affective & behavioral neuroscience.

[35]  Daniel Brandeis,et al.  Cognitive flexibility in adolescence: Neural and behavioral mechanisms of reward prediction error processing in adaptive decision making during development , 2015, NeuroImage.

[36]  Amir Homayoun Javadi,et al.  Adolescents Adapt More Slowly than Adults to Varying Reward Contingencies , 2014, Journal of Cognitive Neuroscience.

[37]  Jeffrey M Spielberg,et al.  The role of testosterone and estradiol in brain volume changes across adolescence: A longitudinal structural MRI study , 2014, Human brain mapping.

[38]  L. Somerville,et al.  Adolescent-specific patterns of behavior and neural activity during social reinforcement learning , 2014, Cognitive, affective & behavioral neuroscience.

[39]  R. Dahl,et al.  Exciting fear in adolescence: Does pubertal development alter threat processing? , 2014, Developmental Cognitive Neuroscience.

[40]  Karl J. Friston,et al.  Bayesian model selection for group studies — Revisited , 2014, NeuroImage.

[41]  Matthijs A. A. van der Meer,et al.  Adaptive properties of differential learning rates for positive and negative outcomes , 2013, Biological Cybernetics.

[42]  Michael J. Brammer,et al.  Neural and Psychological Maturation of Decision-making in Adolescence and Young Adulthood , 2013, Journal of Cognitive Neuroscience.

[43]  Daniel C. McNamee,et al.  Category-dependent and category-independent goal-value codes in human ventromedial prefrontal cortex , 2013, Nature Neuroscience.

[44]  Sumio Watanabe,et al.  A widely applicable Bayesian information criterion , 2012, J. Mach. Learn. Res..

[45]  David B. Dunson,et al.  Bayesian data analysis, third edition , 2013 .

[46]  Michael X. Cohen,et al.  Striatum-medial prefrontal cortex connectivity predicts developmental changes in reinforcement learning. , 2012, Cerebral cortex.

[47]  Anne G E Collins,et al.  How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis , 2012, The European journal of neuroscience.

[48]  E. Crone,et al.  Distinct linear and non-linear trajectories of reward and punishment reversal learning during development: Relevance for dopamine's role in adolescent decision making , 2011, Developmental Cognitive Neuroscience.

[49]  Bregtje Gunther Moor,et al.  Testosterone Levels Correspond with Increased Ventral Striatum Activation in Response to Monetary Rewards in Adolescents , 2011, Developmental Cognitive Neuroscience.

[50]  E. Crone,et al.  Sex steroids and brain structure in pubertal boys and girls: a mini-review of neuroimaging studies , 2011, Neuroscience.

[51]  T. Robbins,et al.  Decision Making, Affect, and Learning: Attention and Performance XXIII , 2011 .

[52]  Nathaniel D. Daw,et al.  Trial-by-trial data analysis using computational models , 2011 .

[53]  J. O'Doherty,et al.  Overlapping responses for the expectation of juice and money rewards in human ventromedial prefrontal cortex. , 2011, Cerebral cortex.

[54]  P. Dayan,et al.  States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning , 2010, Neuron.

[55]  Leah H. Somerville,et al.  A time of change: Behavioral and neural correlates of adolescent sensitivity to appetitive and aversive environmental cues , 2010, Brain and Cognition.

[56]  Vivian V. Valentin,et al.  Overlapping prediction errors in dorsal striatum during instrumental learning with juice and money reward in the human brain. , 2009, Journal of neurophysiology.

[57]  J. O'Doherty,et al.  Evidence for a Common Representation of Decision Values for Dissimilar Goods in Human Ventromedial Prefrontal Cortex , 2009, The Journal of Neuroscience.

[58]  S. Rombouts,et al.  Better than Expected or as Bad as You Thought? The Neurocognitive Development of Probabilistic Feedback Processing , 2009, Front. Hum. Neurosci..

[59]  Karl J. Friston,et al.  Bayesian model selection for group studies , 2009, NeuroImage.

[60]  Y. Niv Reinforcement learning in the brain , 2009 .

[61]  L. Steinberg A Social Neuroscience Perspective on Adolescent Risk-Taking. , 2008, Developmental review : DR.

[62]  Michael J. Frank,et al.  Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning , 2007, Proceedings of the National Academy of Sciences.

[63]  Timothy E. J. Behrens,et al.  Learning the value of information in an uncertain world , 2007, Nature Neuroscience.

[64]  G. Glover,et al.  Earlier Development of the Accumbens Relative to Orbitofrontal Cortex Might Underlie Risk-Taking Behavior in Adolescents , 2006, The Journal of Neuroscience.

[65]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[66]  Michael J. Frank,et al.  By Carrot or by Stick: Cognitive Reinforcement Learning in Parkinsonism , 2004, Science.

[67]  H. Kraemer,et al.  How can we learn about developmental processes from cross-sectional studies, or can we? , 2000, The American journal of psychiatry.

[68]  A. Petersen,et al.  A self-report measure of pubertal status: Reliability, validity, and initial norms , 1988, Journal of youth and adolescence.

[69]  H. Akaike A new look at the statistical model identification , 1974 .