Learning Physically Based Humanoid Climbing Movements

We propose a learning-based solution for motion planning of physically based humanoid climbing that allows fast and robust planning of complex climbing strategies and movements, including extreme movements such as jumping. Similar to recent work, we combine a high-level graph-based path planner with low-level sampling-based optimization of climbing moves. Our contribution is to show that neural network models of move success probability, effortfulness, and control policy can make both the high-level and low-level components more efficient and robust. The models can be trained through random simulation practice without any data. The models also eliminate the need for laboriously hand-tuned heuristics for graph search. As a result, we are able to efficiently synthesize climbing sequences involving dynamic leaps and one-hand swings; that is, we place no limits on the movement complexity or on the number of limbs allowed to move simultaneously. Our supplemental video also provides some comparisons between our AI climber and a real human climber.
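To illustrate how learned models could replace hand-tuned graph-search heuristics, the following is a minimal sketch and not the method of the paper. It assumes a stance graph given as an adjacency dictionary and two hypothetical callables, `success_prob` and `effort`, that stand in for the trained neural network models. The edge cost (effort plus a negative log success probability, weighted by the hypothetical parameter `risk_weight`) and the use of Dijkstra's algorithm are assumptions made for illustration only.

```python
import heapq
import math


def plan_path(graph, start, goal, success_prob, effort, risk_weight=1.0):
    """Cheapest path through a stance graph via Dijkstra's algorithm.

    graph:        dict mapping a node to an iterable of successor nodes
    success_prob: callable (u, v) -> probability in (0, 1]; stand-in for a learned model
    effort:       callable (u, v) -> non-negative effort; stand-in for a learned model
    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """

    def edge_cost(u, v):
        # Assumed cost: effort plus a penalty that grows as success becomes unlikely.
        p = max(success_prob(u, v), 1e-9)  # clamp to avoid log(0)
        return effort(u, v) - risk_weight * math.log(p)

    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            # Walk parent links back to the start to recover the path.
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1], cost
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt in graph.get(node, ()):
            new_cost = cost + edge_cost(node, nxt)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(frontier, (new_cost, nxt))
    return None, math.inf


# Toy usage with made-up numbers: the move C -> D is risky, so the planner
# routes through B even though every move has the same effort.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
toy_probs = {("A", "B"): 0.9, ("B", "D"): 0.9, ("A", "C"): 0.99, ("C", "D"): 0.2}
path, cost = plan_path(
    toy_graph, "A", "D",
    success_prob=lambda u, v: toy_probs[(u, v)],
    effort=lambda u, v: 1.0,
)
print(path, round(cost, 3))
```

In this sketch, raising `risk_weight` makes the planner more conservative, while setting it to zero reduces the search to minimizing predicted effort alone.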
