Robot Program Parameter Inference via Differentiable Shadow Program Inversion

Challenging manipulation tasks can be solved effectively by combining individual robot skills, which must be parameterized for the concrete physical environment and task at hand. This parameterization is time-consuming and difficult for human programmers, particularly for force-controlled skills. To address this, we present Shadow Program Inversion (SPI), a novel approach to infer optimal skill parameters directly from data. SPI leverages unsupervised learning to train an auxiliary differentiable program representation ("shadow program") and realizes parameter inference via gradient-based model inversion. Our method enables the use of efficient first-order optimizers to infer optimal parameters for originally non-differentiable skills, including many skill variants currently used in production. SPI generalizes zero-shot across task objectives: shadow programs do not need to be retrained to infer parameters for different task variants. We evaluate our method on three different robots and skill frameworks in industrial and household scenarios. Code and examples are available at https://innolab.artiminds.com/icra2021.
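The core idea of gradient-based model inversion can be sketched in a few lines of PyTorch: a trained shadow program (a frozen, differentiable surrogate mapping skill parameters to a predicted execution trace) is held fixed while a first-order optimizer updates only its inputs against a task objective. The sketch below is illustrative only; the names `infer_parameters`, `shadow`, and `task_cost` are assumptions, and the toy network stands in for an actual shadow program rather than the authors' implementation.

```python
import torch

# Minimal sketch of gradient-based model inversion (illustrative names,
# not the authors' API). "shadow" is a trained, frozen differentiable
# surrogate mapping skill parameters to a predicted execution trace;
# "task_cost" scores that trace for the task at hand.

def infer_parameters(shadow, task_cost, init_params, steps=200, lr=1e-2):
    """First-order parameter inference through a fixed shadow program."""
    params = init_params.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        predicted_trace = shadow(params)   # differentiable forward prediction
        loss = task_cost(predicted_trace)  # objective can be swapped without retraining
        loss.backward()                    # gradients w.r.t. the skill parameters only
        optimizer.step()
    return params.detach()

# Toy usage with a stand-in network and a quadratic task cost.
shadow = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.Tanh(), torch.nn.Linear(16, 6))
for p in shadow.parameters():
    p.requires_grad_(False)                # shadow weights stay fixed; only inputs are optimized
target_trace = torch.zeros(6)
best = infer_parameters(shadow, lambda t: ((t - target_trace) ** 2).mean(), torch.randn(4))
```

Because only the inputs carry gradients, changing the task objective requires no retraining of the surrogate, which is the mechanism behind the zero-shot generalization across task variants described above.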
