Your favorite color makes learning more precise and adaptable

Learning from reward feedback is essential for survival but can become extremely challenging when choice options have multiple features and feature values (the curse of dimensionality). Here, we propose a general framework for learning reward values in dynamic multi-dimensional environments by encoding and updating the average value of individual features. We predicted that this feature-based learning occurs not just because it can reduce dimensionality but, more importantly, because it can increase adaptability without compromising precision. We tested this novel prediction experimentally and found that in dynamic environments, human subjects adopted feature-based learning even when this approach did not reduce dimensionality. Even in static, low-dimensional environments, subjects initially tended to adopt feature-based learning and switched to learning individual option values only when feature values could not accurately predict the values of all objects. Moreover, the behavior of two alternative network models demonstrated that hierarchical decision-making and learning could account for our experimental results and thus provide a plausible mechanism for model adoption during learning in dynamic environments. Our results constrain the neural mechanisms underlying learning in dynamic multi-dimensional environments and highlight the importance of neurons encoding the values of individual features in this learning.
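To make the contrast between the two learning strategies concrete, the following minimal Python sketch compares an object-based learner, which tracks one value per object, with a feature-based learner, which tracks the average value of each feature. The delta-rule update with learning rate ALPHA, the 3 x 3 feature space, and the equal-weight averaging of feature values are illustrative assumptions, not the exact model fitted in the paper.

    import numpy as np

    # Hypothetical setup: options defined by two features (color and shape),
    # each with three possible values, giving 3 x 3 = 9 distinct objects.
    N_COLORS, N_SHAPES = 3, 3
    ALPHA = 0.1  # delta-rule learning rate (assumed value)

    # Object-based learner: one value estimate per (color, shape) combination.
    object_values = np.zeros((N_COLORS, N_SHAPES))

    # Feature-based learner: one estimate per feature value (3 + 3 = 6 in total).
    color_values = np.zeros(N_COLORS)
    shape_values = np.zeros(N_SHAPES)

    def update_object_based(color, shape, reward):
        # Only the chosen object's value moves toward the observed reward.
        object_values[color, shape] += ALPHA * (reward - object_values[color, shape])

    def update_feature_based(color, shape, reward):
        # Each feature of the chosen object moves toward the observed reward,
        # so the outcome generalizes to every object sharing those features.
        color_values[color] += ALPHA * (reward - color_values[color])
        shape_values[shape] += ALPHA * (reward - shape_values[shape])

    def feature_based_value(color, shape):
        # Combine feature values into an object value (equal weighting assumed).
        return 0.5 * (color_values[color] + shape_values[shape])

    # One simulated trial: choose the object (color 0, shape 2), receive reward 1.
    update_object_based(0, 2, reward=1.0)
    update_feature_based(0, 2, reward=1.0)
    print(object_values[0, 2], feature_based_value(0, 2))

Because a single reward outcome updates feature values shared by many objects, the feature-based learner re-estimates values with fewer observations after reward contingencies change; this is the adaptability advantage tested in the experiments, bought at the cost of precision whenever object values are not well approximated by combinations of their feature values.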
