Stochastic Optimal Control and Estimation Methods Adapted to the Noise Characteristics of the Sensorimotor System

Optimality principles of biological movement are conceptually appealing and straightforward to formulate. Testing them empirically, however, requires solving stochastic optimal control and estimation problems for reasonably realistic models of the motor task and the sensorimotor periphery. Recent studies have highlighted the importance of incorporating biologically plausible noise into such models. Here we extend the linear-quadratic-Gaussian framework, currently the only framework where such problems can be solved efficiently, to include control-dependent, state-dependent, and internal noise. Under this extended noise model, we derive a coordinate-descent algorithm guaranteed to converge to a feedback control law and a nonadaptive linear estimator that are optimal with respect to each other. Numerical simulations indicate that convergence is exponential, that local minima do not exist, and that the restriction to nonadaptive linear estimators has negligible effects in the control problems of interest. The application of the algorithm is illustrated in the context of reaching movements. A Matlab implementation is available at www.cogsci.ucsd.edu/todorov.
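To illustrate how control-dependent noise enters the control law, the following is a minimal sketch of a much simpler special case than the one treated here: a scalar, fully observable system x[t+1] = a*x[t] + b*u[t] + c*u[t]*eps[t] with unit-variance noise eps and quadratic cost. It is not the coordinate-descent algorithm of the paper and involves no estimator; the function name, parameter names, and numerical values are hypothetical and chosen only for illustration. In this special case the noise scale c acts as an extra control penalty inside the Riccati-type backward recursion.

```python
def lqr_gains_control_dependent_noise(a, b, c, q, r, q_final, horizon):
    """Feedback gains for the scalar system
        x[t+1] = a*x[t] + b*u[t] + c*u[t]*eps[t],  eps[t] zero-mean, unit variance,
    with per-step cost q*x^2 + r*u^2 and final cost q_final*x^2.
    Assumes full state observation (illustrative special case only).
    Returns gains L[0..horizon-1] for the control law u[t] = -L[t]*x[t].
    """
    s = q_final  # cost-to-go coefficient at the final time
    gains = []
    for _ in range(horizon):
        # Control-dependent noise adds c^2 * s to the effective control cost.
        gain = a * b * s / (r + (b * b + c * c) * s)
        # Backward update of the cost-to-go coefficient.
        s = q + a * s * (a - b * gain)
        gains.append(gain)
    gains.reverse()  # gains were computed from the last step backward
    return gains


# Hypothetical parameter values, chosen only for illustration.
gains_clean = lqr_gains_control_dependent_noise(1.0, 1.0, 0.0, 1.0, 0.1, 1.0, 20)
gains_noisy = lqr_gains_control_dependent_noise(1.0, 1.0, 0.5, 1.0, 0.1, 1.0, 20)

print("final-step gain, c = 0.0:", gains_clean[-1])
print("final-step gain, c = 0.5:", gains_noisy[-1])
```

With these values the final-step gain is smaller when c > 0, which reflects the intuition that large commands become costly when they inject proportionally more noise.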
