Multi-robot transfer learning: A dynamical system perspective

Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are shorter training time and reduced exposure to the unavoidable risks of the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate the properties of such optimal transfer maps through a theoretical study of single-input single-output (SISO) systems. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is an algorithm for determining the properties of this optimal dynamic map, including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots' dynamics; it relies only on basic system properties that are easily obtained through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with the correct properties obtained from our proposed algorithm, achieves a 60–70% reduction in transfer learning error compared to transferring the data directly or through an optimal static map.
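To illustrate why the optimal transfer map is generally dynamic rather than static, the following minimal sketch may help. It is not the paper's algorithm: it assumes two hypothetical discrete-time, first-order, linear SISO systems driven by the same input, with noise-free data, and all names and parameter values (a_s, b_s, a_t, b_t, the function names) are chosen purely for illustration. Under these assumptions, the exact map from the source output to the target output is itself a first-order dynamic system with regressors y_t[k], y_s[k+1], and y_s[k], so a least-squares fit over those regressors should outperform both direct transfer and the best static gain.

```python
import numpy as np


def simulate_first_order(a, b, u):
    """Simulate y[k+1] = a*y[k] + b*u[k] from y[0] = 0."""
    y = np.zeros(len(u) + 1)
    for k in range(len(u)):
        y[k + 1] = a * y[k] + b * u[k]
    return y


def fit_static_map(y_s, y_t):
    """Least-squares scalar gain c such that y_t ~= c * y_s."""
    return float(y_s @ y_t / (y_s @ y_s))


def fit_dynamic_map(y_s, y_t):
    """Least-squares fit of y_t[k+1] ~= th0*y_t[k] + th1*y_s[k+1] + th2*y_s[k]."""
    regressors = np.column_stack([y_t[:-1], y_s[1:], y_s[:-1]])
    theta, *_ = np.linalg.lstsq(regressors, y_t[1:], rcond=None)
    return theta


def apply_dynamic_map(theta, y_s):
    """Run the fitted map on source data only (no target measurements)."""
    y_hat = np.zeros_like(y_s)
    for k in range(len(y_s) - 1):
        y_hat[k + 1] = (theta[0] * y_hat[k]
                        + theta[1] * y_s[k + 1]
                        + theta[2] * y_s[k])
    return y_hat


def rms(e):
    return float(np.sqrt(np.mean(e ** 2)))


# Hypothetical source and target robots (illustrative values only).
a_s, b_s = 0.9, 0.1
a_t, b_t = 0.6, 0.4

rng = np.random.default_rng(0)
u_train = rng.standard_normal(500)
u_test = rng.standard_normal(500)

# Training data: both robots respond to the same input.
ys_train = simulate_first_order(a_s, b_s, u_train)
yt_train = simulate_first_order(a_t, b_t, u_train)

c = fit_static_map(ys_train, yt_train)
theta = fit_dynamic_map(ys_train, yt_train)

# Test data: predict the target output from the source output alone.
ys_test = simulate_first_order(a_s, b_s, u_test)
yt_test = simulate_first_order(a_t, b_t, u_test)

err_direct = rms(yt_test - ys_test)        # use source data as-is
err_static = rms(yt_test - c * ys_test)    # best static gain
err_dynamic = rms(yt_test - apply_dynamic_map(theta, ys_test))

print("direct transfer RMS error:", err_direct)
print("static map RMS error:     ", err_static)
print("dynamic map RMS error:    ", err_dynamic)
```

In this idealized setting the fitted coefficients coincide with the analytical ones, (a_t, b_t/b_s, -a_s*b_t/b_s), and the dynamic map reproduces the target output up to numerical precision, while the static gain leaves a clearly nonzero residual. With measurement noise or nonlinear dynamics the regressor choice is no longer obvious, which is the question the paper's algorithm addresses.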
