Adaptive control of LCL filter with time-varying parameters using reinforcement learning

This paper studies two aspects of LQ control of a single-phase converter with an LCL filter. The first is the choice of a penalization structure that prevents instability of the closed loop caused by input saturation, which arises from the limited voltage of the DC link. We show that all tested variants of the penalization can achieve stable performance of the system. The second is the adaptation of the controller gains to a change in a parameter of the LCL filter. We compare adaptation rules based on the classical Riccati equation with rules based on the reinforcement learning method of temporal difference gradient descent. The latter method has a lower computational cost and has the potential to extend more readily to non-linear systems. We show that the standard approach with a constant learning rate is inefficient and that better results can be achieved using an adaptive learning rate. The performance of the method with an adaptive learning rate is tested in a simulation of current tracking after a step change of the output inductance.
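To make the contrast between a constant and an adaptive learning rate concrete, the following is a minimal sketch on a toy problem, not the method of the paper. It assumes a scalar discrete-time plant x[k+1] = a*x[k] + b*u[k] with a fixed gain u = -k*x and a quadratic value function V(x) = p*x^2, and it estimates p by semi-gradient temporal difference descent. The "adaptive" variant here is a simple normalized step (step size divided by the squared gradient), chosen only for illustration; the adaptive rule used in the paper is not specified in the abstract. All names and numerical values (`td_policy_evaluation`, `exact_p`, `eta`, `eps`, the plant parameters) are hypothetical.

```python
def exact_p(a, b, k, q, r):
    """Closed-form p for V(x) = p*x**2 under u = -k*x (scalar Lyapunov equation)."""
    a_cl = a - b * k                      # closed-loop pole, must satisfy |a_cl| < 1
    return (q + r * k**2) / (1.0 - a_cl**2)


def td_policy_evaluation(a, b, k, q, r, eta, normalized,
                         episodes=3, steps=20, eps=1e-8):
    """Estimate p by semi-gradient TD(0) on the undiscounted quadratic cost.

    normalized=False: constant learning rate eta.
    normalized=True:  step eta / (grad**2 + eps), an illustrative adaptive rule
                      (an assumption, not the rule from the paper).
    """
    p = 0.0
    for _ in range(episodes):
        x = 1.0                           # reset the state at the start of each episode
        for _ in range(steps):
            u = -k * x
            cost = q * x**2 + r * u**2
            x_next = a * x + b * u
            # TD error for V(x) = p*x**2
            delta = cost + p * x_next**2 - p * x**2
            grad = x**2                   # dV/dp at the current state
            step = eta / (grad**2 + eps) if normalized else eta
            p += step * delta * grad
            x = x_next
    return p


if __name__ == "__main__":
    params = dict(a=1.0, b=0.5, k=0.2, q=1.0, r=0.1)   # hypothetical toy values
    p_star = exact_p(**params)
    p_const = td_policy_evaluation(**params, eta=1.0, normalized=False)
    p_norm = td_policy_evaluation(**params, eta=1.0, normalized=True)
    print("exact:", p_star, "constant rate:", p_const, "normalized rate:", p_norm)
```

In this toy setting the constant-rate update is scaled by x^4, so it slows down as the state decays toward zero, while the normalized step keeps the contraction rate roughly independent of the state magnitude. This only hints at why a fixed learning rate may be inefficient; the behaviour on the full LCL filter model may differ.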
