Improving tracking performance by learning from past data

The main theme of this thesis is the development of machine learning algorithms for high-performance trajectory tracking. We consider dynamic systems that are required to precisely follow predefined trajectories. The goal of this research is to explore how past data (for example, measurements from previous executions) can be used to improve a system’s tracking performance.

A typical means of imposing a desired behavior on a dynamic system is feedback control. In such a setup, the motion of the system is guided by an external reference signal, and the influence of noise and unexpected disturbances is reduced by feeding back the measured system output. The design of feedback control systems is often based on a mathematical model of the underlying system. The performance of such control schemes is limited by the accuracy of the dynamics model and by the causality of the control action, which compensates for disturbances only as they occur. We address these limitations by proposing a data-based control approach that is able to store and interpret information from past experiments, and infer the correct control actions for future performances.

This research is motivated by recent computational advances, which provide enormous possibilities for storing, processing and evaluating large amounts of data. We aim to exploit these new possibilities with three main contributions.

First, we present an algorithm that exploits data from a repeated operation in order to learn to precisely follow a predefined trajectory. We adapt the feed-forward reference signal to the system with the goal of achieving high tracking performance, even in the presence of model errors and other recurring disturbances. The approach is based on a coarse model of the system dynamics and uses measurements from past executions to optimize the tracking performance.
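The idea of correcting a feed-forward input from one execution to the next can be illustrated with a minimal sketch. It is not the algorithm of the thesis: the scalar dynamics, the numerical values, the fixed learning gain and all function names are assumptions chosen for illustration, and the sketch omits the filtering, constraint handling and early termination that the thesis describes. It only shows how a deliberately inaccurate model, combined with the measured error of each execution, can drive the tracking error down over repeated trials.

```python
import numpy as np

def lifted_model(a, b, n):
    """Lower-triangular map from inputs u[0..n-1] to outputs y[0..n-1]
    for the assumed scalar model x[k+1] = a*x[k] + b*u[k], with y[k]
    taken as the state after step k and zero initial state."""
    F = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            F[i, j] = a ** (i - j) * b
    return F

def run_true_system(u, a=0.9, b=0.5, disturbance=0.05):
    """Stand-in for a real experiment (hypothetical values): the true
    dynamics differ from the coarse model and include a disturbance
    that repeats identically in every execution."""
    x = 0.0
    y = np.empty(len(u))
    for k, uk in enumerate(u):
        x = a * x + b * uk + disturbance
        y[k] = x
    return y

def learn(y_des, iterations=10, gain=0.8):
    """Repeat the task, each time correcting the feed-forward input
    with the model-inverted tracking error of the last execution."""
    n = len(y_des)
    F = lifted_model(a=0.8, b=0.6, n=n)   # coarse (wrong) model
    F_pinv = np.linalg.pinv(F)
    u = F_pinv @ y_des                    # model-based initial guess
    errors = []
    for _ in range(iterations):
        e = y_des - run_true_system(u)    # measured tracking error
        errors.append(float(np.linalg.norm(e)))
        u = u + gain * (F_pinv @ e)       # feed-forward correction
    return u, errors

y_des = np.sin(np.linspace(0.0, np.pi, 30))
u_learned, errors = learn(y_des)
```

In this toy setting the error norm shrinks from one execution to the next even though the model parameters are wrong and the disturbance is never modeled, because both effects repeat in every trial and are therefore captured by the measured error.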
We combine traditional optimal filtering methods with state-of-the-art optimization techniques in order to obtain an effective and computationally efficient learning strategy. The proposed approach falls into the area of iterative learning control. Novel features of our approach are the direct treatment of input and state constraints when updating the feed-forward reference, an identification routine that extracts the required system model from a numerical simulation, and a termination condition that stops an execution early if the deviation from the nominal trajectory exceeds a given bound. The termination condition allows for safe learning that gradually extends the time horizon of the trajectory. These new features are particularly relevant when we apply the algorithm to highly maneuverable quadrotor vehicles in the ETH Flying Machine Arena. We aim to exploit their full dynamic potential and to improve on time-optimized trajectories. The learning scheme has proven to be effective both when directly learning the thrust and rotational rate inputs sent to the quadrocopter, and when building the learning scheme on
