An Approximate Inference Approach to Temporal Optimization in Optimal Control

Algorithms based on iterative local approximations offer a practical approach to optimal control in robotic systems. However, they generally require temporal parameters (e.g., the movement duration or the time at which an intermediate goal is reached) to be specified a priori. Here, we present a methodology that jointly optimizes the temporal parameters together with the control command profiles. The approach is based on a Bayesian canonical time formulation of the optimal control problem, with the temporal mapping from canonical to real time parametrised by an additional control variable. An approximate EM algorithm is derived that efficiently optimizes both the movement duration and the control commands, offering, for the first time, a practical approach to tackling generic via-point problems in a systematic way under the optimal control framework. The proposed approach, which is applicable to plants with non-linear dynamics as well as arbitrary state-dependent and quadratic control costs, is evaluated on realistic simulations of a redundant robotic plant.
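
The idea of a canonical-to-real time mapping driven by an extra control variable can be illustrated with a small sketch. This is not the paper's EM algorithm. It only shows, under assumed names and a hypothetical 1-D point-mass plant, how a positive time-scaling value per canonical step (called `beta` here, an assumption) stretches or compresses real time while the number of canonical steps stays fixed.

```python
# Illustrative sketch only: `beta`, `delta`, and the point-mass plant are
# assumptions made for this example, not definitions taken from the paper.

def real_time_from_canonical(beta, delta):
    """Map canonical steps of fixed width `delta` to real time points.

    beta[k] > 0 scales the real duration of canonical step k, so the
    total movement duration is delta * sum(beta).
    """
    if any(b <= 0 for b in beta):
        raise ValueError("time-scaling values must be positive")
    times = [0.0]
    for b in beta:
        times.append(times[-1] + b * delta)
    return times


def rollout(x0, v0, accel, beta, delta):
    """Euler rollout of a 1-D point mass over the canonical steps.

    Each canonical step k applies acceleration accel[k] for a real
    duration of beta[k] * delta.
    """
    x, v = x0, v0
    trajectory = [(x, v)]
    for a, b in zip(accel, beta):
        dt = b * delta  # real duration of this canonical step
        x, v = x + v * dt, v + a * dt
        trajectory.append((x, v))
    return trajectory


if __name__ == "__main__":
    beta = [1.0, 2.0, 0.5]  # slow down step 2, speed up step 3
    print(real_time_from_canonical(beta, delta=0.1))
    print(rollout(0.0, 1.0, accel=[0.0, 0.0, 0.0], beta=beta, delta=0.1))
```

In this sketch, an optimizer that treats `beta` as an additional control could change the movement duration, or the real time at which a via point is reached, without changing the number of canonical steps.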
