Visualizing Movement Control Optimization Landscapes

A large body of animation research focuses on optimizing movement control, expressed either as action sequences or as policy parameters. However, because closed-form expressions of the objective functions are often unavailable, our understanding of these optimization problems is limited. Building on recent work on analyzing neural network training, we contribute novel visualizations of high-dimensional control optimization landscapes; this yields insights into why control optimization is hard and why common practices such as early termination and spline-based action parameterizations make optimization easier. For example, our experiments show how trajectory optimization can become increasingly ill-conditioned with longer trajectories, but parameterizing control as partial target states (e.g., target angles converted to torques using a PD controller) can act as an efficient preconditioner. Both our visualizations and quantitative empirical data indicate that neural network policy optimization scales better than trajectory optimization for long planning horizons. Our work advances the understanding of movement optimization, and our visualizations should also prove valuable in educational use.
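The landscape visualizations build on the random-direction slicing idea from neural network loss-landscape analysis: the objective is evaluated on a 2D plane through a point in parameter space, spanned by two random orthogonal directions. A minimal sketch of this procedure, with an ill-conditioned toy quadratic standing in for a trajectory-optimization cost (function names and parameters are illustrative, not from the paper's code):

```python
import numpy as np

def landscape_slice(objective, theta, radius=1.0, resolution=25, seed=0):
    """Evaluate `objective` on a 2D plane through `theta`, spanned by
    two random orthogonal unit directions in parameter space."""
    rng = np.random.default_rng(seed)
    d1 = rng.standard_normal(theta.shape)
    d1 /= np.linalg.norm(d1)
    d2 = rng.standard_normal(theta.shape)
    d2 -= d1 * (d1 @ d2)              # orthogonalize against d1
    d2 /= np.linalg.norm(d2)
    alphas = np.linspace(-radius, radius, resolution)
    # Grid of objective values; rows step along d1, columns along d2.
    grid = np.array([[objective(theta + a * d1 + b * d2)
                      for b in alphas] for a in alphas])
    return alphas, grid

# Toy ill-conditioned quadratic: steep in one axis, shallow in the other,
# mimicking the conditioning problems seen in long-horizon trajectories.
f = lambda x: x[0] ** 2 + 100.0 * x[1] ** 2
alphas, grid = landscape_slice(f, np.zeros(2), radius=2.0, resolution=11)
```

The resulting `grid` can be rendered as a contour or surface plot; elongated, narrow valleys in such plots are the visual signature of ill-conditioning.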
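The preconditioning effect mentioned above comes from letting the optimizer search over target joint angles instead of raw torques, with a PD controller doing the conversion at simulation time. A minimal sketch of that conversion (the gain values are illustrative assumptions, not from the paper):

```python
def pd_torque(target_angle, angle, angular_velocity, kp=50.0, kd=5.0):
    """Convert a target joint angle (the optimized action) into a torque.
    kp penalizes position error, kd damps velocity."""
    return kp * (target_angle - angle) - kd * angular_velocity

# The optimizer proposes target_angle; the simulator applies the torque.
tau = pd_torque(target_angle=0.5, angle=0.0, angular_velocity=0.0)
```

Because the PD controller drives each joint toward its target regardless of the torque scale needed, similar-sized steps in action space produce similar-sized changes in pose, which flattens the objective's curvature relative to direct torque parameterization.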
