A convex approach to inverse optimal control and its application to modeling human locomotion

Inverse optimal control is the problem of computing a cost function that would have resulted in an observed sequence of decisions. The standard formulation of this problem assumes that decisions are optimal and tries to minimize the difference between what was observed and what would have been observed given a candidate cost function. We assume instead that decisions are only approximately optimal and try to minimize the extent to which observed decisions violate first-order necessary conditions for optimality. For a discrete-time optimal control system with a cost function that is a linear combination of known basis functions, this formulation leads to an efficient method of solution as an unconstrained least-squares problem. We apply this approach to both simulated and experimental data to obtain a simple model of human walking trajectories. This model might subsequently be used either for control of a humanoid robot or for predicting human motion when moving a robot through crowded areas.
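To make the least-squares formulation concrete, here is a minimal sketch on a toy problem that is not taken from the paper. It assumes a scalar integrator x_{k+1} = x_k + u_k with cost c_0 Σ u_k^2 + c_1 Σ x_{k+1}^2, so the two basis functions are the control effort and the state deviation. The function names (`simulate_optimal`, `recover_weights`) and the normalization c_0 = 1 are illustrative assumptions. The normalization is needed because the weights are identifiable only up to a positive scale factor. The residual of the first-order necessary conditions is linear in the unknown weight and the Lagrange multipliers of the dynamics constraints, so minimizing its norm is an ordinary least-squares problem.

```python
import numpy as np

def simulate_optimal(c, x0, N):
    """Forward problem (toy): minimize c[0]*sum(u^2) + c[1]*sum(x^2)
    subject to x_{k+1} = x_k + u_k, starting from x0."""
    L = np.tril(np.ones((N, N)))            # x_{1..N} = x0 + L @ u
    H = c[0] * np.eye(N) + c[1] * L.T @ L   # Hessian of the reduced problem (up to a factor 2)
    u = np.linalg.solve(H, -c[1] * x0 * (L.T @ np.ones(N)))
    x = x0 + L @ u
    return x, u

def recover_weights(x, u):
    """Inverse problem: find c[1] (with c[0] fixed to 1, an assumed
    normalization) and multipliers lam that minimize the norm of
        grad_phi0 + c[1] * grad_phi1 + Jg.T @ lam
    where z = (x_1..x_N, u_0..u_{N-1}) stacks the decision variables."""
    N = len(u)
    # Jacobian of g_k = x_{k+1} - x_k - u_k with respect to z
    Jx = np.eye(N) - np.eye(N, k=-1)
    Jg = np.hstack([Jx, -np.eye(N)])
    grad_phi0 = np.concatenate([np.zeros(N), 2.0 * u])   # gradient of sum(u^2)
    grad_phi1 = np.concatenate([2.0 * x, np.zeros(N)])   # gradient of sum(x^2)
    A = np.column_stack([grad_phi1, Jg.T])               # unknowns: [c1, lam]
    b = -grad_phi0
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    residual = np.linalg.norm(A @ sol - b)
    return np.array([1.0, sol[0]]), residual

if __name__ == "__main__":
    x, u = simulate_optimal(np.array([1.0, 0.5]), x0=2.0, N=20)
    c_hat, residual = recover_weights(x, u)
    print("recovered weights:", c_hat, "residual norm:", residual)
```

On exactly optimal data the residual is zero up to numerical precision and the recovered weight matches the ratio c_1 / c_0 used to generate the trajectory. On noisy or only approximately optimal observations the same solve returns the weights that violate the optimality conditions least, which is the sense in which this formulation tolerates suboptimal decisions.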
