Learning predict-and-simulate policies from unorganized human motion data

The goal of this research is to create physically simulated biped characters equipped with a rich repertoire of motor skills. The user can control the characters interactively by modulating their control objectives. The characters can interact physically with each other and with the environment. We present a novel network-based algorithm that learns control policies from unorganized, minimally labeled human motion data. Our network architecture for interactive character animation couples an RNN-based motion generator with a DRL-based controller for physics simulation and control. The motion generator guides the forward dynamics simulation by feeding the controller a sequence of future motion frames to track. This rich future prediction facilitates policy learning from large training data sets. We demonstrate the effectiveness of our approach with biped characters that learn a variety of dynamic motor skills from large, unorganized data and react to unexpected perturbations beyond the scope of the training data.
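To make the predict-and-simulate coupling concrete, the following is a minimal Python/NumPy sketch of one control loop: a recurrent motion generator predicts a window of future frames, and the learned policy tracks them through a physics step. All class and function names (DummySim, MotionGenerator, ControlPolicy, rollout) and the placeholder computations are hypothetical illustrations of the architecture described above, not the authors' implementation or the actual network designs.

```python
import numpy as np

class DummySim:
    """Placeholder for the forward dynamics simulator (the paper's system uses a
    rigid-body simulator such as DART); here it returns zero-filled states."""
    def __init__(self, state_dim, frame_dim):
        self.state_dim, self.frame_dim = state_dim, frame_dim

    def reset(self):
        return np.zeros(self.state_dim), np.zeros(self.frame_dim)

    def step(self, action):
        # A real simulator would integrate the character dynamics under
        # (for example) PD control toward the policy's target pose.
        return np.zeros(self.state_dim), np.zeros(self.frame_dim)

class MotionGenerator:
    """Stand-in for the RNN-based motion generator: given the current motion
    frame, it predicts a short sequence of future frames to track."""
    def __init__(self, frame_dim, horizon=10):
        self.horizon = horizon
        self.hidden = np.zeros(frame_dim)        # placeholder recurrent state

    def predict_future(self, current_frame):
        # A real implementation would run a trained RNN; this toy version just
        # smooths the input and repeats it over the prediction horizon.
        self.hidden = 0.9 * self.hidden + 0.1 * current_frame
        return np.tile(self.hidden, (self.horizon, 1))

class ControlPolicy:
    """Stand-in for the DRL-trained policy: maps the simulated character state
    plus the predicted future frames to joint-level actions."""
    def __init__(self, state_dim, frame_dim, horizon, action_dim):
        in_dim = state_dim + horizon * frame_dim
        self.W = np.zeros((action_dim, in_dim))  # placeholder for learned weights

    def act(self, sim_state, future_frames):
        obs = np.concatenate([sim_state, future_frames.ravel()])
        return self.W @ obs

def rollout(sim, generator, policy, n_steps):
    """One episode: the generator proposes future frames, the policy tracks
    them through forward dynamics simulation."""
    state, frame = sim.reset()
    for _ in range(n_steps):
        future = generator.predict_future(frame)  # frames the policy should track
        action = policy.act(state, future)
        state, frame = sim.step(action)
    return state

if __name__ == "__main__":
    sim = DummySim(state_dim=60, frame_dim=50)
    gen = MotionGenerator(frame_dim=50, horizon=10)
    policy = ControlPolicy(state_dim=60, frame_dim=50, horizon=10, action_dim=30)
    rollout(sim, gen, policy, n_steps=100)
```

The key design point illustrated here is that the policy does not consume a single reference pose but an entire predicted window, which is what allows one controller to be trained against large, heterogeneous motion data.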
