Learning to schedule control fragments for physics-based characters using deep Q-learning

Given a robust control system, physical simulation offers the potential for interactive human characters that move in realistic and responsive ways. In this article, we describe how to learn a scheduling scheme that reorders short control fragments as necessary at runtime, creating a control system that can respond to disturbances and allow steering and other user interactions. These schedulers provide robust control of a wide range of highly dynamic behaviors, including walking on a ball, balancing on a bongo board, skateboarding, running, push recovery, and breakdancing. We show that moderate-sized Q-networks can model the schedulers for these control tasks effectively and that those schedulers can be learned efficiently with deep Q-learning.
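
To make the scheduling idea concrete, the sketch below shows one plausible shape for such a scheduler: a moderate-sized Q-network maps a simulation state to one Q-value per control fragment, an epsilon-greedy rule picks the fragment to execute next, and a standard deep Q-learning update trains the network from recorded transitions. This is a minimal illustration, not the authors' implementation; the state dimension, number of fragments, layer sizes, and hyperparameters are assumptions chosen for readability.

```python
# Minimal sketch of a Q-network scheduler for control fragments.
# All dimensions and hyperparameters below are illustrative assumptions,
# not values taken from the paper.
import random
import torch
import torch.nn as nn

STATE_DIM = 60        # assumed size of the reduced simulation state
NUM_FRAGMENTS = 20    # assumed number of short control fragments

class QScheduler(nn.Module):
    """Q-network: simulation state -> one Q-value per control fragment."""
    def __init__(self, state_dim=STATE_DIM, num_fragments=NUM_FRAGMENTS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, num_fragments),
        )

    def forward(self, state):
        return self.net(state)

def select_fragment(q_net, state, epsilon=0.1):
    """Epsilon-greedy choice of the index of the next fragment to play."""
    if random.random() < epsilon:
        return random.randrange(NUM_FRAGMENTS)
    with torch.no_grad():
        q_values = q_net(state.unsqueeze(0))   # add batch dimension
    return int(q_values.argmax(dim=1).item())

def dqn_update(q_net, target_net, batch, optimizer, gamma=0.95):
    """One Q-learning step on a batch of (s, a, r, s', done) transitions."""
    states, actions, rewards, next_states, dones = batch
    q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        target = rewards + gamma * (1.0 - dones) * q_next
    loss = nn.functional.smooth_l1_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At runtime, the scheduler would call select_fragment at each fragment boundary to decide which short tracking controller to execute next; during training, transitions gathered this way feed dqn_update, with the target network refreshed periodically as in standard deep Q-learning.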
