Paper information - On discovery and learning of models with predictive representations of state for agents with continuous actions and observations

Models of agent-environment interaction that use predictive state representations (PSRs) have mainly focused on the case of discrete observations and actions. The theory of discrete PSRs uses an elegant construct called the system dynamics matrix and derives the notion of predictive state as a sufficient statistic via the rank of the matrix. With continuous observations and actions, such a matrix and its rank no longer exist. In this paper, we show how to define an analogous construct for the continuous case, called the system dynamics distributions, and use information-theoretic notions to define a sufficient statistic and thus state. Given this new construct, we use kernel density estimation to learn approximate system dynamics distributions from data, and use information-theoretic tools to derive algorithms for discovery of state and learning of model parameters. We illustrate our new modeling method on two example problems.
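The abstract names kernel density estimation as the tool for approximating distributions from data. As a rough illustration of that general technique only, and not the paper's algorithm, the sketch below fits a one-dimensional Gaussian kernel density estimate to sampled scalar observations. The function name `gaussian_kde_1d`, the synthetic data, and the bandwidth of 0.3 are all assumptions made for this example.

```python
import numpy as np

def gaussian_kde_1d(samples, query, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each query point.

    Hypothetical helper for illustration; not taken from the paper.
    """
    samples = np.asarray(samples, dtype=float)
    query = np.asarray(query, dtype=float)
    # Pairwise scaled differences, shape (n_query, n_samples).
    z = (query[:, None] - samples[None, :]) / bandwidth
    # Gaussian kernel centered on every sample.
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    # The estimate is the average of the per-sample kernels.
    return kernels.mean(axis=1)

# Synthetic scalar "observations" (assumed data, standard normal).
rng = np.random.default_rng(0)
observations = rng.normal(loc=0.0, scale=1.0, size=500)

# Evaluate the estimated density on a grid and check that it integrates to about 1.
grid = np.linspace(-8.0, 8.0, 801)
density = gaussian_kde_1d(observations, grid, bandwidth=0.3)
area = float(np.sum(density) * (grid[1] - grid[0]))
```

The bandwidth controls how smooth the estimate is. A fixed value is used here for simplicity, whereas in practice it would usually be chosen from the data.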

Satinder P. Singh | David Wingate
