A Model of Reward- and Effort-Based Optimal Decision Making and Motor Control

Costs (e.g., energetic expenditure) and benefits (e.g., food) are central determinants of behavior. In ecology and economics, they are combined into a utility function that is maximized to guide choices. This principle is widely used in neuroscience as a normative model of decision and action, but current versions of the model do not consider how decisions are actually converted into actions (i.e., the formation of trajectories). Here, we describe an approach in which decision making and motor control are optimal, iterative processes derived from the maximization of the discounted, weighted difference between expected rewards and foreseeable motor efforts. The model accounts for decision making in cost/benefit situations and for detailed characteristics of control and goal tracking in realistic motor tasks. As a normative construction, the model is relevant for addressing the neural bases and pathological aspects of decision making and motor control.
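To make the idea of a "discounted, weighted difference between expected rewards and foreseeable motor efforts" concrete, the following is a minimal toy sketch and not the model of the paper. Everything specific in it is an assumption for illustration: the function names, the weights `rho` and `epsilon`, the exponential discount with time constant `gamma`, the choice to leave effort undiscounted, and the effort term `12 * distance**2 / duration**3`, which is the minimum squared-force cost of a rest-to-rest movement of a unit point mass.

```python
import math


def utility(duration, reward, distance, rho=1.0, epsilon=0.01, gamma=2.0):
    """Toy utility of covering `distance` in `duration` seconds to obtain `reward`.

    All parameters and the functional form are illustrative assumptions.
    """
    # Reward is obtained at the end of the movement and discounted by its delay.
    discounted_reward = rho * reward * math.exp(-duration / gamma)
    # Minimum integral of squared force for a unit point mass, rest to rest.
    effort = 12.0 * distance ** 2 / duration ** 3
    return discounted_reward - epsilon * effort


def best_duration(reward, distance, **weights):
    """Grid search for the movement duration that maximizes the toy utility."""
    durations = [0.05 + 0.001 * i for i in range(5000)]
    return max(durations, key=lambda t: utility(t, reward, distance, **weights))


def choose(options, **weights):
    """Pick the (reward, distance) option with the highest achievable utility."""
    def value(option):
        reward, distance = option
        t = best_duration(reward, distance, **weights)
        return utility(t, reward, distance, **weights)
    return max(options, key=value)
```

In this sketch a single quantity drives both the choice between options (`choose`) and how the selected movement is executed (`best_duration`). Under these assumptions, a larger reward favors a shorter movement and a more distant target a longer one.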
