Dynamic Estimation of Task-Relevant Variance in Movement under Risk

In planning movement trajectories, humans take into account both their own movement variability and the potential consequences of different movement outcomes. When variability increases, planned movements are altered so as to optimize the expected consequences of the movement. Past research has focused on steady-state responses to changing conditions of movement under risk. Here, we study the dynamics of such strategy adjustment in a visuomotor decision task in which subjects reach toward a display containing regions that lead to rewards and penalties, under conditions of changing uncertainty. In typical reinforcement learning tasks, subjects should base their subsequent strategy on an estimate of the mean outcome (e.g., reward) over recent trials. In our task, by contrast, strategy should be based on a dynamic estimate of recent outcome uncertainty (i.e., squared error). We find that subjects respond to increased movement uncertainty by aiming more conservatively with respect to penalty regions, and that the estimate of uncertainty they use is well characterized by a weighted average of recent squared errors, with higher weights given to more recent trials.
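
As an illustration of the kind of estimator described above, the following minimal sketch computes a weighted average of recent squared errors in which more recent trials receive higher weights. The geometric form of the weights, the function name, and the `decay` and `window` parameters are assumptions made for this example only; they are not the fitted weights or the model reported in the study.

```python
def weighted_variance_estimate(errors, decay=0.7, window=10):
    """Weighted average of the most recent squared movement errors.

    `decay` and `window` are hypothetical illustration parameters:
    the weight on a trial shrinks by a factor of `decay` for each
    trial of lag, and only the last `window` trials are used.
    """
    recent = list(errors)[-window:]
    if not recent:
        raise ValueError("need at least one trial")
    n = len(recent)
    # The last element is the most recent trial (lag 0, weight 1).
    weights = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(weights)
    # Normalizing by the weight sum makes this a weighted average.
    return sum(w * e * e for w, e in zip(weights, recent)) / total


# Ten low-error trials followed by three high-error trials.
errors = [1.0] * 10 + [3.0] * 3
estimate = weighted_variance_estimate(errors)
```

Because the three most recent trials carry most of the weight, the estimate rises toward the new squared error faster than an unweighted mean over the same window would, which is the qualitative behavior expected of an estimator that tracks changing uncertainty.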
