Learning bicycle stunts

We present a general approach for simulating and controlling a human character that is riding a bicycle. The two main components of our system are offline learning and online simulation. We simulate the bicycle and the rider as an articulated rigid body system. The rider is controlled by a policy that is optimized through offline learning. We apply policy search to learn the optimal policies, which are parameterized with splines or neural networks for different bicycle maneuvers. We use NeuroEvolution of Augmenting Topologies (NEAT) to optimize both the parameterization and the parameters of our policies. The learned controllers are robust enough to withstand large perturbations and allow interactive user control. The rider not only learns to steer and to balance in normal riding situations, but also learns to perform a wide variety of stunts, including the wheelie, endo, bunny hop, front wheel pivot, and back hop.
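
The abstract above describes the approach at a high level; the sketch below is meant only to illustrate the general shape of episodic policy search with a neural-network policy tuned by an evolutionary method. It is not the authors' implementation: the network topology here is fixed and perturbed by a simple (1+λ) hill-climbing strategy rather than NEAT (which also evolves topology), and `simulate_episode`, its placeholder dynamics, reward, and dimensions are hypothetical stand-ins for the articulated rigid-body simulation and task objectives described in the paper.

```python
import numpy as np

# Toy dimensions; the real state/action spaces depend on the rider model.
STATE_DIM, ACTION_DIM, HIDDEN = 16, 6, 12


def init_policy(rng):
    """Random weights for a one-hidden-layer tanh network (fixed topology)."""
    return {
        "W1": rng.normal(0.0, 0.1, (HIDDEN, STATE_DIM)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.1, (ACTION_DIM, HIDDEN)),
        "b2": np.zeros(ACTION_DIM),
    }


def act(policy, state):
    """Map the rider/bicycle state to bounded actuation commands."""
    hidden = np.tanh(policy["W1"] @ state + policy["b1"])
    return np.tanh(policy["W2"] @ hidden + policy["b2"])


def simulate_episode(policy, rng, steps=600):
    """Hypothetical rollout: in the real system this would step the articulated
    rigid-body simulation; here placeholder dynamics and reward stand in."""
    state = rng.normal(0.0, 0.1, STATE_DIM)
    total_reward = 0.0
    for _ in range(steps):
        action = act(policy, state)
        # Placeholder dynamics: random drift plus a weak effect of the action.
        state = state + 0.01 * rng.normal(size=STATE_DIM)
        state[:ACTION_DIM] += 0.01 * action
        # Placeholder reward: stay near the "balanced" origin.
        total_reward += 1.0 - np.linalg.norm(state[:2])
    return total_reward


def policy_search(generations=50, offspring=16, sigma=0.05, seed=0):
    """Simple (1+lambda) evolutionary policy search over the network weights."""
    rng = np.random.default_rng(seed)
    best = init_policy(rng)
    best_reward = simulate_episode(best, rng)
    for _ in range(generations):
        for _ in range(offspring):
            candidate = {name: w + sigma * rng.normal(size=w.shape)
                         for name, w in best.items()}
            reward = simulate_episode(candidate, rng)
            if reward > best_reward:
                best, best_reward = candidate, reward
    return best, best_reward


if __name__ == "__main__":
    _, reward = policy_search(generations=10)
    print(f"best episodic reward: {reward:.2f}")
```

A real setup would replace the placeholder rollout with the physics simulator and a per-maneuver reward (balance duration, target lean angle, and so on), and would let the optimizer also grow the network structure, as NEAT does.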
