Active Learning for Nonlinear System Identification with Guarantees

While the identification of nonlinear dynamical systems is a fundamental building block of model-based reinforcement learning and feedback control, its sample complexity is understood only for systems that either have discrete states and actions or can be identified from data generated by i.i.d. random inputs. Nonetheless, many interesting dynamical systems have continuous states and actions and can only be identified through a judicious choice of inputs. Motivated by practical settings, we study a class of nonlinear dynamical systems whose state transitions depend linearly on a known feature embedding of state-action pairs. To estimate such systems in finite time, identification methods must explore all directions in feature space. We propose an active learning approach that achieves this by repeating three steps: trajectory planning, trajectory tracking, and re-estimation of the system from all available data. We show that our method estimates nonlinear dynamical systems at a parametric rate, similar to the statistical rate of standard linear regression.
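
The plan, track, re-estimate loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-dimensional system `A_TRUE`, the feature map `phi`, the random-shooting planner, and the input range [-2, 2] are all hypothetical. The tracking step is simplified to open-loop execution of the planned inputs, and the estimator is ordinary ridge-regularized least squares on the model x_{t+1} = A phi(x_t, u_t) + w_t.

```python
import numpy as np

# Hypothetical ground-truth parameters (illustrative, not from the paper).
A_TRUE = np.array([[0.7, 0.3, 0.2, 0.0],
                   [0.0, 0.6, -0.3, 1.0]])

def phi(x, u):
    # Assumed known feature embedding of a state-action pair.
    return np.array([x[0], x[1], np.sin(x[0]), u])

def step(A, x, u, rng=None, noise=0.0):
    # One transition x_next = A @ phi(x, u) + w; noiseless if rng is None.
    w = noise * rng.standard_normal(A.shape[0]) if rng is not None else 0.0
    return A @ phi(x, u) + w

def fit(feats, nexts, reg=1e-6):
    # Ridge-regularized least squares over all data collected so far.
    Phi, Y = np.asarray(feats), np.asarray(nexts)
    G = Phi.T @ Phi + reg * np.eye(Phi.shape[1])
    return np.linalg.solve(G, Phi.T @ Y).T

def plan(A_hat, x0, v, horizon, n_candidates, rng):
    # Random-shooting planner (a stand-in for trajectory planning): roll out
    # candidate input sequences under the current estimate and keep the one
    # whose predicted features align most with the direction v.
    best_score, best_inputs = -np.inf, None
    for _ in range(n_candidates):
        inputs = rng.uniform(-2.0, 2.0, size=horizon)
        x, score = x0.copy(), 0.0
        for u in inputs:
            score += (phi(x, u) @ v) ** 2
            x = step(A_hat, x, u)
        if score > best_score:
            best_score, best_inputs = score, inputs
    return best_inputs

def identify(A_true, rounds=30, horizon=10, n_candidates=20, noise=0.01, seed=0):
    rng = np.random.default_rng(seed)
    d_x, k = A_true.shape
    A_hat = np.zeros((d_x, k))
    V = np.zeros((k, k))          # design matrix: sum of phi phi^T
    x = np.zeros(d_x)
    feats, nexts = [], []
    for _ in range(rounds):
        # Least-explored direction: eigenvector of the smallest eigenvalue
        # (np.linalg.eigh returns eigenvalues in ascending order).
        _, vecs = np.linalg.eigh(V)
        inputs = plan(A_hat, x, vecs[:, 0], horizon, n_candidates, rng)
        # Simplified "tracking": execute the planned inputs open loop.
        for u in inputs:
            f = phi(x, u)
            x_next = step(A_true, x, u, rng, noise)
            feats.append(f)
            nexts.append(x_next)
            V += np.outer(f, f)
            x = x_next
        # Re-estimate from all available data.
        A_hat = fit(feats, nexts)
    return A_hat

A_hat = identify(A_TRUE)
print("max abs parameter error:", np.max(np.abs(A_hat - A_TRUE)))
```

Steering toward the eigenvector of the smallest eigenvalue of the design matrix is one simple way to encode the requirement that all directions in feature space be explored; a closed-loop tracking controller would replace the open-loop execution in a more faithful implementation.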
