Paper information - Learning predictive state representations in dynamical systems without reset

Predictive state representations (PSRs) are a recently developed way to model discrete-time, controlled dynamical systems. We present two algorithms for learning a PSR model: a Monte Carlo algorithm and a temporal difference (TD) algorithm. Both can learn models of systems without requiring a reset action, which the previously available general PSR-model learning algorithm needed. We present empirical results that compare our two algorithms with each other and with existing algorithms, including an EM algorithm for learning POMDP models.

Michael R. James | Satinder P. Singh | Britton Wolfe
