Feature-based learning improves adaptability without compromising precision

Learning from reward feedback is essential for survival but can become extremely challenging with myriad choice options. Here, we propose that learning the reward values of individual features can provide a heuristic for estimating the reward values of choice options in dynamic, multi-dimensional environments. We hypothesize that this feature-based learning occurs not just because it can reduce dimensionality, but more importantly because it can increase adaptability without compromising the precision of learning. We experimentally test this hypothesis and find that in dynamic environments, human subjects adopt feature-based learning even when this approach does not reduce dimensionality. Even in static, low-dimensional environments, subjects initially adopt feature-based learning and gradually switch to learning the reward values of individual options, depending on how accurately objects' values can be predicted by combining feature values. Our computational models reproduce these results and highlight the importance of neurons coding feature values for parallel learning of values for features and objects.

Learning about a rewarded outcome is complicated by the fact that a choice often incorporates multiple features with differing associations with the reward. Here the authors demonstrate that feature-based learning is an efficient and adaptive strategy in dynamically changing environments.
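
To make the contrast between the two learning strategies concrete, the sketch below shows a minimal feature-based learner alongside an object-based learner, both using a simple delta-rule update. The toy environment (two feature dimensions, a fixed learning rate, averaged feature values) and all parameter choices are illustrative assumptions for this sketch, not the authors' actual models or task parameters.

```python
# Minimal sketch: feature-based vs. object-based value learning with a
# delta-rule update. Environment and parameters are hypothetical.
import itertools

N_COLORS, N_SHAPES = 3, 3   # two feature dimensions (assumed sizes)
ALPHA = 0.1                 # learning rate (assumed)

# Object-based learner: one value per color-shape conjunction (9 values).
object_values = {obj: 0.5 for obj in itertools.product(range(N_COLORS), range(N_SHAPES))}

# Feature-based learner: one value per feature (6 values); an object's value
# is estimated by combining (here, averaging) its feature values.
color_values = [0.5] * N_COLORS
shape_values = [0.5] * N_SHAPES

def feature_estimate(color, shape):
    """Estimate an object's value from its feature values."""
    return 0.5 * (color_values[color] + shape_values[shape])

def update(color, shape, reward):
    # Object-based update: only the chosen conjunction moves.
    obj = (color, shape)
    object_values[obj] += ALPHA * (reward - object_values[obj])
    # Feature-based update: both feature values move, so every object that
    # shares a feature with the chosen one is updated in parallel, which is
    # what makes this strategy adapt faster after a change in reward.
    err = reward - feature_estimate(color, shape)
    color_values[color] += ALPHA * err
    shape_values[shape] += ALPHA * err

# One rewarded choice of (color 0, shape 1) changes the feature-based
# estimate of every object containing color 0 or shape 1, but only one
# entry of the object-based value table.
update(color=0, shape=1, reward=1.0)
print(object_values[(0, 1)], feature_estimate(0, 2))
```

The sketch also shows the trade-off stated in the abstract: feature-based estimates are only as precise as the assumption that an object's value can be reconstructed from its features, whereas object-based values are exact but must be learned separately for each option.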
