Hilbert Space Embeddings of Predictive State Representations

Predictive State Representations (PSRs) are an expressive class of models for controlled stochastic processes. PSRs represent state as a set of predictions of future observable events. Because PSRs are defined entirely in terms of observable data, statistically consistent estimates of PSR parameters can be learned efficiently by manipulating moments of observed training data. Most learning algorithms for PSRs have assumed that actions and observations are finite with low cardinality. In this paper, we generalize PSRs to infinite sets of observations and actions, using the recent concept of Hilbert space embeddings of distributions. The essence is to represent the state as one or more nonparametric conditional embedding operators in a Reproducing Kernel Hilbert Space (RKHS) and leverage recent work in kernel methods to estimate, predict, and update the representation. We show that these Hilbert space embeddings of PSRs are able to gracefully handle continuous actions and observations, and that our learned models outperform competing system identification algorithms on several prediction benchmarks.
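
To make the idea of a nonparametric conditional embedding operator concrete, the following is a minimal sketch of the standard empirical conditional mean embedding estimator from the kernel methods literature. It is not the learning algorithm of this paper. The function names, the Gaussian RBF kernel, the bandwidth, the regularization constant `reg`, and the synthetic sine data are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian RBF Gram matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def conditional_embedding_weights(x_train, x_query, bandwidth=1.0, reg=1e-3):
    """Weights w(x) = (K + reg * n * I)^{-1} k(x) of the empirical embedding.

    The embedding of P(Y | X = x) is approximated by sum_i w_i(x) * phi(y_i),
    so E[f(Y) | X = x] is approximated by sum_i w_i(x) * f(y_i).
    """
    n = x_train.shape[0]
    gram = rbf_kernel(x_train, x_train, bandwidth)
    k_query = rbf_kernel(x_train, x_query, bandwidth)
    return np.linalg.solve(gram + reg * n * np.eye(n), k_query)


# Hypothetical data: continuous inputs x and noisy continuous outputs y.
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(300, 1))
y_train = np.sin(x_train) + 0.1 * rng.standard_normal((300, 1))

# Estimate E[Y | X = 0.5] by applying the weights to the training outputs.
x_query = np.array([[0.5]])
weights = conditional_embedding_weights(x_train, x_query, bandwidth=0.5)
estimate = float(weights[:, 0] @ y_train[:, 0])
print(estimate, np.sin(0.5))
```

The estimator never discretizes `x` or `y`. All computation goes through Gram matrices, which is why embeddings of this kind can accommodate continuous actions and observations.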

Byron Boots | Arthur Gretton | Geoffrey J. Gordon

[1] C. Baker. Joint measures and cross-covariance operators, 1973.

[2] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[3] Yoshua Bengio, et al. An Input Output HMM Architecture, 1994, NIPS.

[4] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[5] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.

[6] Michael R. James, et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems, 2004, UAI.

[7] J. Suykens, et al. Least Squares Support Vector Machines for Kernel CCA in Nonlinear State-Space Identification, 2004.

[8] Yoshinobu Kawahara, et al. A Kernel Subspace Method by Stochastic Realization for Learning Nonlinear Dynamical Systems, 2006, NIPS.

[9] Michael H. Bowling, et al. Learning predictive state representations using non-blind policies, 2006, ICML '06.

[10] Le Song, et al. A Hilbert Space Embedding for Distributions, 2007, Discovery Science.

[11] Byron Boots, et al. A Constraint Generation Approach to Learning Stable Linear Dynamical Systems, 2007, NIPS.

[12] Satinder P. Singh, et al. Exponential Family Predictive Representations of State, 2007, NIPS.

[13] Satinder P. Singh, et al. On discovery and learning of models with predictive representations of state for agents with continuous actions and observations, 2007, AAMAS '07.

[14] Bernhard Schölkopf, et al. Injective Hilbert Space Embeddings of Probability Measures, 2008, COLT.

[15] Alexander J. Smola, et al. Hilbert space embeddings of conditional distributions with applications to dynamical systems, 2009, ICML '09.

[16] Le Song, et al. Hilbert Space Embeddings of Hidden Markov Models, 2010, ICML.

[17] Byron Boots, et al. Closing the learning-planning loop with predictive state representations, 2009, Int. J. Robotics Res.

[18] Dieter Fox, et al. Learning GP-BayesFilters via Gaussian process latent variable models, 2009, Auton. Robots.

[19] Le Song, et al. Kernel Bayes' Rule, 2010, NIPS.

[20] Byron Boots, et al. An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems, 2011, AAAI.

[21] Guy Lever, et al. Conditional mean embeddings as regressors, 2012, ICML.

[22] Guy Lever, et al. Modelling transition dynamics in MDPs with RKHS embeddings, 2012, ICML.

[23] Kenji Fukumizu, et al. Hilbert Space Embeddings of POMDPs, 2012, UAI.

[24] Byron Boots, et al. Two Manifold Problems with Applications to Nonlinear System Identification, 2012, ICML.