Multi-view Matrix Factorization for Linear Dynamical System Estimation

We consider maximum likelihood estimation of linear dynamical systems with generalized-linear observation models. Maximum likelihood estimation is typically considered hard in this setting, since latent states and transition parameters must be inferred jointly. Because expectation-maximization does not scale and is prone to local optima, moment-matching approaches from the subspace identification literature have become standard, despite known statistical efficiency issues. In this paper, we instead reconsider likelihood maximization and develop an optimization-based strategy for recovering the latent states and transition parameters. Key to the approach is a two-view reformulation of maximum likelihood estimation for linear dynamical systems that enables the use of global optimization algorithms for matrix factorization. We show that the proposed estimation strategy outperforms widely used identification algorithms, such as subspace identification methods, in both accuracy and runtime.
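
For orientation, the model class named above can be written in its standard textbook form. This is a sketch under assumed notation: the symbols $x_t$ (latent state), $y_t$ (observation), $A$ (transition matrix), $C$ (observation matrix), $\Sigma$ (noise covariance), and $f$ (inverse link function) are chosen here for illustration and are not taken from the paper.

```latex
\begin{align}
x_{t+1} &= A x_t + \epsilon_t, & \epsilon_t &\sim \mathcal{N}(0, \Sigma), \\
\mathbb{E}[y_t \mid x_t] &= f(C x_t).
\end{align}
% Joint maximum likelihood over states and parameters (assumed notation):
\begin{equation}
\max_{A,\, C,\, x_{1:T}} \; \sum_{t=1}^{T} \log p(y_t \mid x_t)
  + \sum_{t=1}^{T-1} \log p(x_{t+1} \mid x_t)
\end{equation}
```

In this notation, the difficulty mentioned in the abstract is that the states $x_{1:T}$ and the parameters $A$ and $C$ enter the objective as products, so they must be estimated together.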
