Efficient Learning of a Linear Dynamical System with Stability Guarantees

We propose a principled method for projecting an arbitrary square matrix to the nonconvex set of asymptotically stable matrices. Leveraging ideas from large deviations theory, we show that this projection is optimal in an information-theoretic sense and that it simply amounts to shifting the initial matrix by an optimal linear quadratic feedback gain, which can be computed exactly and highly efficiently by solving a standard linear quadratic regulator problem. The proposed approach allows us to learn the system matrix of a stable linear dynamical system from a single trajectory of correlated state observations. The resulting estimator is guaranteed to be stable and offers explicit statistical bounds on the estimation error.
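To make the construction concrete, below is a minimal Python sketch of the pipeline the abstract describes: estimate the system matrix by least squares from a single trajectory, then stabilize it by subtracting an LQR feedback gain obtained from a discrete-time algebraic Riccati equation. The choice B = I and the weight matrices Q, R are illustrative assumptions; the paper derives the specific, information-theoretically optimal projection, which this sketch does not reproduce.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def least_squares_estimate(X):
    """Ordinary least-squares estimate of A from a single state trajectory
    x_0, ..., x_T stacked column-wise in X (shape n x (T+1))."""
    X_past, X_next = X[:, :-1], X[:, 1:]
    # Solve min_A ||X_next - A X_past||_F, i.e. A_hat = X_next X_past^+.
    return X_next @ np.linalg.pinv(X_past)

def stable_projection(A_hat, Q=None, R=None):
    """Shift A_hat by an LQR feedback gain so the result is Schur stable.
    Illustrative choices: B = I and hypothetical weights Q, R; the paper's
    projection uses weights derived from its information-theoretic criterion."""
    n = A_hat.shape[0]
    Q = np.eye(n) if Q is None else Q
    R = np.eye(n) if R is None else R
    B = np.eye(n)
    # Solve the discrete-time algebraic Riccati equation for (A_hat, B, Q, R).
    P = solve_discrete_are(A_hat, B, Q, R)
    # Optimal LQR gain K = (R + B' P B)^{-1} B' P A_hat.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A_hat)
    # The closed-loop matrix A_hat - B K is asymptotically (Schur) stable.
    return A_hat - B @ K

# Example: learn a stable system matrix from one noisy trajectory.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.5], [0.0, 0.8]])
x = np.zeros((2, 201))
for t in range(200):
    x[:, t + 1] = A_true @ x[:, t] + 0.1 * rng.standard_normal(2)

A_hat = least_squares_estimate(x)
A_stable = stable_projection(A_hat)
assert np.max(np.abs(np.linalg.eigvals(A_stable))) < 1.0  # spectral radius below one
```

With B = I the pair (A_hat, B) is always stabilizable, so the Riccati equation is solvable for any estimate and the shifted matrix is guaranteed to have spectral radius strictly less than one, mirroring the stability guarantee stated above.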
