Learning basketball dribbling skills using trajectory optimization and deep reinforcement learning

Basketball is one of the world's most popular sports because of the agility and speed demonstrated by the players. That same agility and speed make it a challenge for physics-based character animation to design controllers that robustly reproduce basketball skills. The highly dynamic behaviors and precise ball manipulation that occur in the game are difficult for simulated players to reproduce. In this paper, we present an approach for learning robust basketball dribbling controllers from motion capture data. Our system decouples a basketball controller into locomotion control and arm control components and learns each component separately. To achieve robust control of the ball, we develop an efficient pipeline based on trajectory optimization and deep reinforcement learning, and use it to learn non-linear arm control policies. We also present a technique for simultaneously learning skills and the transitions between them. Our system is capable of learning robust controllers for various basketball dribbling skills, such as dribbling between the legs and crossover moves. The resulting control graphs enable a simulated player to transition between these skills and respond to user interaction.
