Joint Learning of Linear Time-Invariant Dynamical Systems

Learning the parameters of a linear time-invariant dynamical system (LTIDS) is a problem of current interest. In many applications, one is interested in jointly learning the parameters of multiple related LTIDS, a problem that remains unexplored to date. To that end, we develop a joint estimator for learning the transition matrices of LTIDS that share common basis matrices. Further, we establish finite-time error bounds that depend on the sample size, dimension, number of tasks, and spectral properties of the transition matrices. The results are obtained under mild regularity assumptions and showcase the gains from pooling information across LTIDS, compared with learning each system separately. We also study the impact of misspecifying the joint structure of the transition matrices and show that the established results are robust to moderate misspecification.
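
To make the shared-basis setting concrete, the sketch below simulates several systems whose transition matrices are linear combinations of a few common basis matrices, fits each system separately by least squares, and then pools the fits by projecting them onto a common low-dimensional span. The notation is an assumption made for illustration and is not fixed by the abstract: A_m = sum_i beta[m, i] * W[i] with k basis matrices, dynamics x[t+1] = A_m x[t] + noise, and all function names are hypothetical. The pooling step is a simple truncated-SVD heuristic swapped in for illustration; it is not the joint estimator developed in the paper.

```python
import numpy as np

def shared_basis_systems(M, d, k, rng, rho=0.9):
    """Draw M transition matrices A_m = sum_i beta[m, i] * W[i] (assumed structure)."""
    W = rng.standard_normal((k, d, d))      # k common basis matrices
    beta = rng.standard_normal((M, k))      # system-specific weights
    A = np.einsum("mi,ijk->mjk", beta, W)
    for m in range(M):
        # Scalar rescaling keeps A_m in span(W) and sets its spectral radius to rho.
        A[m] *= rho / np.max(np.abs(np.linalg.eigvals(A[m])))
    return A

def simulate(A, T, rng, noise_std=1.0):
    """Generate one trajectory x[0..T] of x[t+1] = A x[t] + Gaussian noise."""
    d = A.shape[0]
    x = np.zeros((T + 1, d))
    for t in range(T):
        x[t + 1] = A @ x[t] + noise_std * rng.standard_normal(d)
    return x

def ols_transition(x):
    """Per-system least-squares estimate of A from a single trajectory."""
    X0, X1 = x[:-1], x[1:]
    At, *_ = np.linalg.lstsq(X0, X1, rcond=None)  # solves X0 @ A.T ~ X1
    return At.T

def project_to_shared_span(A_hat, k):
    """Pooling heuristic (not the paper's estimator): stack vec(A_hat[m]) as
    columns and keep the best rank-k approximation of the stacked matrix."""
    M, d, _ = A_hat.shape
    V = A_hat.reshape(M, d * d).T                 # shape (d*d, M)
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    V_k = (U[:, :k] * s[:k]) @ Vt[:k]
    return V_k.T.reshape(M, d, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, d, k, T = 20, 4, 2, 60                     # illustrative sizes only
    A_true = shared_basis_systems(M, d, k, rng)
    A_sep = np.stack([ols_transition(simulate(A_true[m], T, rng)) for m in range(M)])
    A_pool = project_to_shared_span(A_sep, k)
    print("separate fit, mean squared error:", np.mean((A_sep - A_true) ** 2))
    print("pooled fit,   mean squared error:", np.mean((A_pool - A_true) ** 2))
```

In this sketch each separate fit has d*d free parameters per system, whereas the pooled fit restricts all M estimates to a common k-dimensional span, which is the kind of information sharing the abstract refers to. How much the pooled error improves depends on M, T, k, and the noise level, so the script prints both errors rather than asserting a particular gap.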
