Uncertainty in learning, choice, and visual fixation

Significance

Humans cannot help but turn their gaze to objects that catch their attention. Our knowledge of the factors that govern this seizure, or of its effects in the context of learned decision making, is currently rather incomplete. We therefore monitored the gaze of human subjects as they learned to choose between multiple options whose value was initially unknown. We found evidence that attention was influenced by uncertainty and that the use of, and reduction in, uncertainty were, in turn, influenced by attention. Our findings provide evidence for approximately optimal models of learning and choice and uncover an intricate interplay between learning, choice, and attentional processes.

Abstract

Uncertainty plays a critical role in reinforcement learning and decision making. However, exactly how it influences behavior remains unclear. Multiarmed-bandit tasks offer an ideal test bed, since computational tools such as approximate Kalman filters can closely characterize the interplay between trial-by-trial values, uncertainty, learning, and choice. To gain additional insight into learning and choice processes, we obtained data from subjects’ overt allocation of gaze. The estimated value and estimation uncertainty of options influenced what subjects looked at before choosing; these same quantities also influenced choice, as additionally did fixation itself. A momentary measure of uncertainty in the form of absolute prediction errors determined how long participants looked at the obtained outcomes. These findings affirm the importance of uncertainty in multiple facets of behavior and help delineate its effects on decision making.
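To make the quantities named in the abstract concrete, the following is a minimal sketch of a Kalman-filter learner on a multiarmed bandit. It is not the authors' fitted model: it assumes stationary Gaussian rewards, a shared prior for all options, and an uncertainty-bonus choice rule, and every name and parameter value here (`kalman_update`, `choose_with_uncertainty_bonus`, `simulate`, `obs_noise_var`, `bonus_weight`) is hypothetical. It tracks a per-option estimated value (`means`), an estimation uncertainty (`variances`), and the absolute prediction error on each outcome.

```python
import random


def kalman_update(mean, var, reward, obs_noise_var):
    """One Kalman-filter update of the chosen option's value estimate.

    Returns the new mean, the new variance, and the absolute prediction error.
    """
    gain = var / (var + obs_noise_var)   # learning rate: larger when uncertainty is high
    pred_error = reward - mean           # signed prediction error
    new_mean = mean + gain * pred_error  # move the estimate toward the outcome
    new_var = (1.0 - gain) * var         # observing the option reduces its uncertainty
    return new_mean, new_var, abs(pred_error)


def choose_with_uncertainty_bonus(means, variances, bonus_weight):
    """Pick the option maximizing value plus a bonus on its standard deviation.

    This is an assumed choice rule for illustration; ties go to the first option.
    """
    scores = [m + bonus_weight * v ** 0.5 for m, v in zip(means, variances)]
    return max(range(len(scores)), key=scores.__getitem__)


def simulate(true_means, n_trials=100, prior_mean=0.0, prior_var=100.0,
             obs_noise_var=16.0, bonus_weight=1.0, seed=0):
    """Run the learner on a stationary Gaussian bandit (all values are assumptions)."""
    rng = random.Random(seed)
    means = [prior_mean] * len(true_means)
    variances = [prior_var] * len(true_means)
    history = []  # one (chosen option, reward, absolute prediction error) per trial
    for _ in range(n_trials):
        arm = choose_with_uncertainty_bonus(means, variances, bonus_weight)
        reward = rng.gauss(true_means[arm], obs_noise_var ** 0.5)
        means[arm], variances[arm], abs_pe = kalman_update(
            means[arm], variances[arm], reward, obs_noise_var)
        history.append((arm, reward, abs_pe))
    return means, variances, history


if __name__ == "__main__":
    means, variances, history = simulate([1.0, 5.0, 3.0])
    print("estimated values:", [round(m, 2) for m in means])
    print("estimation uncertainty:", [round(v, 2) for v in variances])
```

In a sketch like this, the per-trial `means` and `variances` are the kind of regressors one could relate to pre-choice fixations and choices, and the recorded absolute prediction errors are the kind one could relate to outcome viewing times. A restless bandit would additionally add a diffusion term to every variance on each trial, which is omitted here.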
