End-effector control for bipedal locomotion

Biped locomotion can be formulated as a set of goal-directed tasks in the low-dimensional end-effector space, with the upper body and the two feet as end effectors. Based on this observation and on the neuroscience hypothesis of hierarchical control in humans for tasks such as arm reaching and handwriting, I design a framework for the automatic synthesis of hierarchical controllers for biped locomotion. The controller consists of two components: a per-footstep end-effector path planner at the higher level, and a per-timestep generalized-force solver at the lower level. At the start of each footstep, the planner performs short-term planning in the space of end-effector trajectories. At run time, these trajectories adapt to the interactive task goals and to the features of the surrounding uneven terrain in the virtual environment. Using the per-footstep plan, the generalized-force solver takes ground contacts into account and solves a quadratic program at each simulation timestep to obtain the joint torques that drive the biped. The framework solves for the parameters of the planner and the generalized-force solver for different tasks in offline optimizations. I demonstrate the capabilities of the controllers in navigation tasks in which they perform gradual and sharp turns and transition between moving forwards, backwards, and sideways on uneven terrain according to the interactive task goals. I show that the resulting controllers can handle certain morphology changes to the character. To verify that such hierarchical end-effector controllers can potentially be used in real-world mechanical systems, I also show that the controllers are robust to disturbances such as actuator noise, sensor noise, and changes in the friction coefficient. Because human locomotion is routinely subject to these disturbances, this robustness also suggests that the hierarchical control hypothesis proposed for human arm reaching and handwriting may be a good candidate hypothesis for human locomotion control as well.
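
The two-rate structure described above (a planner that runs once per footstep, and a force solver that runs every simulation timestep) can be illustrated with a deliberately small toy. The sketch below is not the controller from this work: it assumes a single 1-D point-mass "end effector", a straight-line plan, PD tracking gains, and a scalar quadratic program whose box-constrained optimum has a closed form. All names (`Plan`, `plan_footstep`, `solve_force`, `simulate`) and all numeric values are hypothetical and chosen only for illustration; the actual solver handles a full articulated character with ground-contact constraints.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """Hypothetical per-footstep plan: a straight-line end-effector path."""
    start: float     # position at the start of the footstep
    target: float    # planned position at the end of the footstep
    duration: float  # footstep duration in seconds

    def desired(self, t):
        """Return the planned (position, velocity) at time t into the footstep."""
        s = min(max(t / self.duration, 0.0), 1.0)
        pos = self.start + s * (self.target - self.start)
        vel = (self.target - self.start) / self.duration
        return pos, vel


def plan_footstep(x, goal, step_length=0.3, duration=0.5):
    """High level: short-horizon plan that moves at most one step length toward the goal."""
    delta = max(-step_length, min(step_length, goal - x))
    return Plan(start=x, target=x + delta, duration=duration)


def solve_force(a_des, mass=1.0, w=1e-3, f_max=50.0):
    """Low level: scalar QP  min_f (f/mass - a_des)^2 + w*f^2  s.t. |f| <= f_max.

    The objective is convex in one variable, so the constrained optimum is
    the unconstrained minimizer clamped to the force bounds.
    """
    f = (a_des / mass) / (1.0 / mass**2 + w)
    return max(-f_max, min(f_max, f))


def simulate(goal=1.0, n_footsteps=6, dt=0.001, kp=400.0, kd=40.0, mass=1.0):
    """Run the two-rate loop and return the final position of the point mass."""
    x, v = 0.0, 0.0
    timesteps_per_footstep = 500  # 0.5 s footstep at dt = 1 ms
    for _ in range(n_footsteps):
        plan = plan_footstep(x, goal)  # replanned once per footstep
        for k in range(timesteps_per_footstep):
            pos_d, vel_d = plan.desired(k * dt)
            # Desired acceleration from PD tracking of the planned trajectory.
            a_des = kp * (pos_d - x) + kd * (vel_d - v)
            f = solve_force(a_des, mass)  # solved every timestep
            # Semi-implicit Euler integration of the point mass.
            v += (f / mass) * dt
            x += v * dt
    return x


if __name__ == "__main__":
    print(f"final position: {simulate():.4f}")
```

Replanning at the start of every footstep is what lets the high level react to a changed goal, while the low level only ever tracks the current short plan; the toy keeps that separation even though everything else about it is simplified.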
