The gradient of the reinforcement landscape influences sensorimotor learning

Consideration of previous successes and failures is essential to mastering a motor skill. Much of what we know about how humans and animals learn from such reinforcement feedback comes from experiments that involve sampling from a small number of discrete actions. Yet it is less well understood how we learn from reinforcement feedback when sampling from a continuous set of possible actions. Navigating a continuous set of possible actions likely requires using gradient information to maximize success. Here we examined how humans adapt the aim of their hand when given reinforcement feedback associated with a continuous set of possible actions. Specifically, we manipulated the reinforcement gradient, defined as the change in the probability of reward given a change in motor action, to study its influence on learning. We found that participants learned faster when exposed to a steep gradient than when exposed to a shallow gradient. Further, when initially positioned between a steep and a shallow gradient that rose in opposite directions, participants were more likely to ascend the steep gradient. We introduce a model that captures our results and several features of motor learning. Taken together, our work suggests that the sensorimotor system relies on temporally recent and spatially local gradient information to drive learning.
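
To make the notion of a reinforcement gradient concrete, the following is a minimal illustrative sketch in Python. It is not the model introduced in this work. It assumes a hypothetical linear reward landscape (`reward_probability`) and a toy learner (`simulate_aim`) that explores around its current aim with Gaussian noise and adopts any action that happens to be rewarded. The slopes, baseline, noise level, trial count, and function names are all assumptions chosen for illustration.

```python
import random


def reward_probability(action, slope, baseline=0.5):
    """Hypothetical linear reward landscape, clipped to [0, 1].

    `slope` plays the role of the reinforcement gradient: the change in
    reward probability per unit change in the action.
    """
    return min(1.0, max(0.0, baseline + slope * action))


def simulate_aim(slope, n_trials=100, noise_sd=1.0, rng=None):
    """Return the toy learner's aim after n_trials, starting from aim = 0."""
    if rng is None:
        rng = random.Random()
    aim = 0.0
    for _ in range(n_trials):
        # Explore: the executed action is the current aim plus Gaussian noise.
        action = aim + rng.gauss(0.0, noise_sd)
        # Binary reward is drawn from the landscape at the executed action.
        if rng.random() < reward_probability(action, slope):
            # Assumed update rule: adopt the rewarded action as the new aim.
            aim = action
    return aim


def mean_final_aim(slope, n_runs=1000, seed=0):
    """Average final aim over many simulated learners (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(simulate_aim(slope, rng=rng) for _ in range(n_runs)) / n_runs


if __name__ == "__main__":
    print("steep gradient,   mean final aim:", mean_final_aim(slope=0.10))
    print("shallow gradient, mean final aim:", mean_final_aim(slope=0.02))
```

Under these assumptions, the expected shift in aim per trial grows with the slope, so the simulated learners on the steep landscape move their aim further over the same number of trials. The sketch uses only the most recent rewarded action and samples only near the current aim, which is one simple way to express reliance on temporally recent and spatially local information.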
