A Multiobjective LQR Synthesis Approach to Dual Control for Uncertain Plants

This letter proposes a finite-horizon LQR synthesis procedure for dual control of unknown systems characterized by mean and covariance estimates. The optimized policy combines a time-varying state-feedback component with a dithering component, and the control problem is framed as a multiobjective synthesis that balances exploitation and exploration costs. It is shown that classic experiment design problems can be recast in this framework by replacing the exploitation cost with an information reward. Numerical examples illustrate how the dual control trade-off varies across plants with different properties.
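To make the policy structure concrete, the following is a minimal sketch of a policy of the form u_t = K_t x_t + e_t, with time-varying feedback gains K_t and a Gaussian dither e_t. It is not the synthesis procedure of the letter: the gains here come from a standard backward Riccati recursion on the mean estimate only (a certainty-equivalent shortcut), the covariance estimate is ignored, the plant is noise-free and equal to the mean estimate, and the names `A_hat`, `B_hat`, `dither_std`, and the numerical values are all hypothetical. The regressor Gram matrix is used as a rough stand-in for an information measure.

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, N):
    """Backward Riccati recursion with terminal weight Q (an assumption).

    Returns the gains K_0..K_{N-1} for u_t = K_t x_t and the cost-to-go
    matrix P_0.
    """
    P = Q.copy()
    gains = []
    for _ in range(N):
        K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A + B @ K)
        gains.append(K)
    return gains[::-1], P

def rollout(A, B, gains, dither_std, x0, Q, R, rng):
    """Simulate u_t = K_t x_t + e_t on a noise-free plant (A, B).

    Returns the quadratic (exploitation) cost and the Gram matrix of the
    regressors z_t = [x_t; u_t], used here as a crude information proxy.
    """
    n, m = B.shape
    x = x0.copy()
    cost = 0.0
    info = np.zeros((n + m, n + m))
    for K in gains:
        e = dither_std * rng.standard_normal(m)  # dithering component
        u = K @ x + e                            # feedback plus dither
        z = np.concatenate([x, u])
        info += np.outer(z, z)
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    cost += x @ Q @ x                            # terminal cost
    return cost, info

# Hypothetical mean estimate of a double-integrator-like plant.
A_hat = np.array([[1.0, 0.1], [0.0, 1.0]])
B_hat = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = 0.1 * np.eye(1)
N = 20
x0 = np.array([1.0, 0.0])

gains, P0 = finite_horizon_lqr(A_hat, B_hat, Q, R, N)
rng = np.random.default_rng(0)

cost_plain, info_plain = rollout(A_hat, B_hat, gains, 0.0, x0, Q, R, rng)
cost_dithered, info_dithered = rollout(A_hat, B_hat, gains, 0.5, x0, Q, R, rng)

for label, cost, info in [("no dither", cost_plain, info_plain),
                          ("dither", cost_dithered, info_dithered)]:
    lam_min = np.linalg.eigvalsh(info).min()
    print(f"{label}: cost = {cost:.3f}, min eig of Gram matrix = {lam_min:.3e}")
```

In this simplified setting the undithered rollout attains the optimal cost x0ᵀ P_0 x0, so any added dither can only increase the exploitation cost. What it buys in return is excitation of the regressors, which is the trade-off the multiobjective formulation is meant to balance.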
