LQR-trees: Feedback motion planning on sparse randomized trees

Recent advances in the direct computation of Lyapunov functions using convex optimization make it possible to efficiently evaluate regions of stability for smooth nonlinear systems. Here we present a feedback motion planning algorithm that uses these results to combine locally valid linear quadratic regulator (LQR) controllers into a nonlinear feedback policy. The policy probabilistically covers the reachable area of a (bounded) state space with a region of stability, certifying that all initial conditions capable of reaching the goal will stabilize to the goal. We carefully investigate the algorithm on a two-dimensional model system and discuss its potential for controlling more complicated underactuated systems, such as bipedal walkers.
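
The building block described above, a locally valid LQR controller paired with an estimate of its region of stability, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the system is a torque-limited, damped pendulum stabilized at its upright position, the parameter values are hypothetical, and the Riccati equation is solved in closed form, which works only for this particular two-state, single-input structure. The region of stability is estimated by grid sampling of the sublevel sets of V(x) = xᵀSx, which is not the convex-optimization certificate the abstract refers to: sampling gives only an optimistic estimate, not a proof.

```python
import math

# Hypothetical pendulum parameters (assumptions, not from the source).
G, DAMP, U_MAX = 9.8, 0.1, 3.0   # gravity term, damping, torque limit
Q1, Q2, R = 1.0, 1.0, 1.0        # LQR weights: Q = diag(Q1, Q2), scalar R


def lqr_pendulum():
    """Closed-form Riccati solution for the upright linearization.

    Linearized model (unit mass and length):
        A = [[0, 1], [G, -DAMP]],  B = [0, 1]^T
    Returns S (2x2 cost-to-go matrix) and K (gain row), with u = -K x.
    """
    b = R * (G + math.sqrt(G * G + Q1 / R))
    c = R * (-DAMP + math.sqrt(DAMP * DAMP + (2.0 * b + Q2) / R))
    a = DAMP * b + b * c / R - G * c
    return [[a, b], [b, c]], [b / R, c / R]


def dynamics(K, x0, x1):
    """Nonlinear closed-loop dynamics; x0 is the angle error from upright."""
    u = -(K[0] * x0 + K[1] * x1)
    u = max(-U_MAX, min(U_MAX, u))  # torque saturation
    return x1, G * math.sin(x0) - DAMP * x1 + u


def v(S, x0, x1):
    """Lyapunov candidate V(x) = x^T S x (the LQR cost-to-go)."""
    return S[0][0] * x0 * x0 + 2.0 * S[0][1] * x0 * x1 + S[1][1] * x1 * x1


def v_dot(S, K, x0, x1):
    """Time derivative of V along the nonlinear closed-loop dynamics."""
    f0, f1 = dynamics(K, x0, x1)
    return 2.0 * ((S[0][0] * x0 + S[0][1] * x1) * f0
                  + (S[0][1] * x0 + S[1][1] * x1) * f1)


def estimate_rho(S, K, n=81, x0_max=math.pi, x1_max=8.0):
    """Smallest V over grid points where V does not decrease.

    On the sampled grid, V_dot < 0 for every point with 0 < V < rho.
    This is a sampling-based estimate, not a formal certificate.
    """
    rho = math.inf
    for i in range(n):
        x0 = -x0_max + 2.0 * x0_max * i / (n - 1)
        for j in range(n):
            x1 = -x1_max + 2.0 * x1_max * j / (n - 1)
            val = v(S, x0, x1)
            # Skip the origin itself, where V and V_dot are both zero.
            if val > 1e-9 and v_dot(S, K, x0, x1) >= 0.0:
                rho = min(rho, val)
    return rho


if __name__ == "__main__":
    S, K = lqr_pendulum()
    rho = estimate_rho(S, K)
    print("LQR gain K:", K)
    print("Estimated sublevel set: x^T S x <", rho)
```

In a planner of the kind the abstract describes, each such controller and its sublevel set would serve as one node; new nodes would be added only for sampled states that fall outside every existing region. That covering loop is omitted here.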
