Visualizing Movement Control Optimization Landscapes

A large body of animation research focuses on optimization of movement control, either as action sequences or policy parameters. However, as closed-form expressions of the objective functions are often not available, our understanding of the optimization problems is limited. Building on recent work on analyzing neural network training, we contribute novel visualizations of high-dimensional control optimization landscapes; this yields insights into why control optimization is hard and why common practices like early termination and spline-based action parameterizations make optimization easier. For example, our experiments show how trajectory optimization can become increasingly ill-conditioned with longer trajectories, but parameterizing control as partial target states—e.g., target angles converted to torques using a PD-controller—can act as an efficient preconditioner. Both our visualizations and quantitative empirical data also indicate that neural network policy optimization scales better than trajectory optimization for long planning horizons. Our work advances the understanding of movement optimization and our visualizations should also provide value in educational use.

[1]  Zoran Popovic,et al.  Optimal gait and form for animal locomotion , 2009, ACM Trans. Graph..

[2]  Zoran Popovic,et al.  Discovery of complex behaviors through contact-invariant optimization , 2012, ACM Trans. Graph..

[3]  Ker-Chau Li,et al.  On Principal Hessian Directions for Data Visualization and Dimension Reduction: Another Application of Stein's Lemma , 1992 .

[4]  Yuval Tassa,et al.  DeepMind Control Suite , 2018, ArXiv.

[5]  Kenneth O. Stanley,et al.  Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning , 2017, ArXiv.

[6]  Jorge Nocedal,et al.  On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.

[7]  Michael F. Cohen,et al.  Interactive spacetime control for animation , 1992, SIGGRAPH.

[8]  Sergey Levine,et al.  Guided Policy Search , 2013, ICML.

[9]  Julian Togelius,et al.  Predictive Physics Simulation in Game Mechanics , 2017, CHI PLAY.

[10]  Nikolaus Hansen,et al.  Evaluating the CMA Evolution Strategy on Multimodal Test Functions , 2004, PPSN.

[11]  Michiel van de Panne,et al.  Learning locomotion skills using DeepRL: does the choice of action space matter? , 2016, Symposium on Computer Animation.

[12]  Andrew P. Witkin,et al.  Spacetime constraints , 1988, SIGGRAPH.

[13]  Nikolaus Hansen,et al.  The CMA Evolution Strategy: A Tutorial , 2016, ArXiv.

[14]  Wojciech Zaremba,et al.  OpenAI Gym , 2016, ArXiv.

[15]  Emanuel Todorov,et al.  Efficient computation of optimal actions , 2009, Proceedings of the National Academy of Sciences.

[16]  Emanuel Todorov,et al.  General duality between optimal control and estimation , 2008, 2008 47th IEEE Conference on Decision and Control.

[17]  Alec Radford,et al.  Proximal Policy Optimization Algorithms , 2017, ArXiv.

[18]  Hans-Georg Beyer,et al.  Large Scale Black-Box Optimization by Limited-Memory Matrix Adaptation , 2019, IEEE Transactions on Evolutionary Computation.

[19]  C. Karen Liu,et al.  Online control of simulated humanoids using particle belief propagation , 2015, ACM Trans. Graph..

[20]  Hao Li,et al.  Visualizing the Loss Landscape of Neural Nets , 2017, NeurIPS.

[21]  Marwan Mattar,et al.  Unity: A General Platform for Intelligent Agents , 2018, ArXiv.

[22]  Jehee Lee,et al.  Simulating biped behaviors from human motion data , 2007, SIGGRAPH 2007.

[23]  Emanuel Todorov,et al.  Combining the benefits of function approximation and trajectory optimization , 2014, Robotics: Science and Systems.

[24]  Jessica K. Hodgins,et al.  Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces , 2004, ACM Trans. Graph..

[25]  C. Karen Liu,et al.  Learning symmetric and low-energy locomotion , 2018, ACM Trans. Graph..

[26]  Nicholay Topin,et al.  Exploring loss function topology with cyclical learning rates , 2017, ArXiv.

[27]  M. V. D. Panne,et al.  Sampling-based contact-rich motion control , 2010, ACM Trans. Graph..

[28]  Kyoungmin Lee,et al.  Scalable muscle-actuated human simulation and control , 2019, ACM Trans. Graph..

[29]  Michiel van de Panne,et al.  Flexible muscle-based locomotion for bipedal creatures , 2013, ACM Trans. Graph..

[30]  Nicolas Gaud,et al.  A Review and Taxonomy of Interactive Optimization Methods in Operations Research , 2015, ACM Trans. Interact. Intell. Syst..

[31]  Kourosh Naderi,et al.  Discovering and synthesizing humanoid climbing movements , 2017, ACM Trans. Graph..

[32]  Russell N. Carney,et al.  Pictorial Illustrations Still Improve Students' Learning from Text , 2002 .

[33]  Sergey Levine,et al.  DeepMimic , 2018, ACM Trans. Graph..

[34]  Aaron Hertzmann,et al.  Trajectory Optimization for Full-Body Movements with Complex Contacts , 2013, IEEE Transactions on Visualization and Computer Graphics.

[35]  Christopher Vyn Jones,et al.  Visualization and Optimization , 1997 .

[36]  Yuval Tassa,et al.  Synthesis and stabilization of complex behaviors through online trajectory optimization , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[37]  Balaraman Ravindran,et al.  EPOpt: Learning Robust Neural Network Policies Using Model Ensembles , 2016, ICLR.

[38]  Joe Marks,et al.  Spacetime constraints revisited , 1993, SIGGRAPH.

[39]  Perttu Hämäläinen,et al.  Augmenting sampling based controllers with machine learning , 2017, Symposium on Computer Animation.

[40]  Joose Rajamäki,et al.  Continuous Control Monte Carlo Tree Search Informed by Multiple Experts , 2019, IEEE Transactions on Visualization and Computer Graphics.

[41]  Oriol Vinyals,et al.  Qualitatively characterizing neural network optimization problems , 2014, ICLR.

[42]  Razvan Pascanu,et al.  Sharp Minima Can Generalize For Deep Nets , 2017, ICML.

[43]  Nancy S. Pollard,et al.  Efficient synthesis of physically valid human motion , 2003, ACM Trans. Graph..

[44]  Jaakko Lehtinen,et al.  Online motion synthesis using sequential Monte Carlo , 2014, ACM Trans. Graph..