An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems

Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systems—for example, Hidden Markov Models (HMMs), Partially Observable Markov Decision Processes (POMDPs), and Transformed Predictive State Representations (TPSRs). These algorithms are attractive since they are statistically consistent and not subject to local optima. However, they are batch methods: they need to store their entire training data set in memory at once and operate on it as a large matrix, so they cannot scale to extremely large data sets (either many examples or many features per example). In turn, this restriction limits their ability to learn accurate models of complex systems. To overcome these limitations, we propose a new online spectral algorithm, which uses techniques such as incremental Singular Value Decomposition (SVD) and random projections to scale to much larger data sets and more complex systems than previous methods. We demonstrate the new method on an inertial measurement prediction task and a high-bandwidth video mapping task, and we illustrate desirable behaviors such as "closing the loop," where the latent state representation changes suddenly as the learner recognizes that it has returned to a previously known place.
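To make the two ingredients named above concrete, here is a minimal sketch (not the authors' code) of a Gaussian random projection of high-dimensional features followed by Brand-style rank-one incremental thin-SVD updates, so the empirical matrix never has to be held in memory at once. The class name, dimensions, and truncation rule are illustrative assumptions; only the left factors are maintained, since the learned subspace is what a spectral learner typically needs.

```python
# Sketch: random projection + incremental thin SVD for streaming feature vectors.
# Assumes we only need the left singular subspace; all names are illustrative.
import numpy as np


class IncrementalSVD:
    """Maintain a rank-k thin SVD (left factors U, singular values S)
    of a streaming column matrix via rank-one updates."""

    def __init__(self, dim, rank, tol=1e-10):
        self.k = rank
        self.tol = tol
        self.U = np.zeros((dim, 0))   # left singular vectors seen so far
        self.S = np.zeros(0)          # singular values seen so far

    def update(self, c):
        """Fold in one new column c, then truncate back to rank k."""
        c = np.asarray(c, dtype=float)
        m = self.U.T @ c                      # component inside current subspace
        p = c - self.U @ m                    # residual orthogonal to the subspace
        p_norm = np.linalg.norm(p)
        P = p / p_norm if p_norm > self.tol else np.zeros_like(p)

        # Small (r+1) x (r+1) core matrix whose SVD carries the update.
        r = len(self.S)
        K = np.zeros((r + 1, r + 1))
        K[:r, :r] = np.diag(self.S)
        K[:r, -1] = m
        K[-1, -1] = p_norm

        Uk, Sk, _ = np.linalg.svd(K)

        # Rotate the enlarged basis, then keep only the top-k directions.
        U_new = np.hstack([self.U, P[:, None]]) @ Uk
        keep = min(self.k, len(Sk))
        self.U, self.S = U_new[:, :keep], Sk[:keep]


def random_projection(dim_in, dim_out, seed=0):
    """Gaussian random projection matrix, a simple stand-in for the
    random-feature techniques mentioned in the abstract."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim_out, dim_in)) / np.sqrt(dim_out)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    raw_dim, proj_dim, rank = 10_000, 200, 10
    R = random_projection(raw_dim, proj_dim)
    isvd = IncrementalSVD(dim=proj_dim, rank=rank)
    for _ in range(500):            # stream of high-dimensional feature vectors
        x = rng.standard_normal(raw_dim)
        isvd.update(R @ x)          # project first, then fold into the thin SVD
    print(isvd.U.shape, isvd.S.shape)   # (200, 10) (10,)
```

Each update costs time proportional to the projected dimension times the kept rank (plus an SVD of a small (k+1)-by-(k+1) core), independent of the number of examples seen, which is what allows this style of update to scale to long data streams.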
