Predictively Defined Representations of State

The concept of state is central to dynamical systems. In any time-series problem, such as filtering, planning, or forecasting, models and algorithms summarize important information from the past into some sort of state variable. In this chapter, we start with a broad examination of the concept of state, with emphasis on the fact that there are many possible representations of state for a given dynamical system, each with different theoretical and computational properties. We then focus on models with predictively defined representations of state, which represent state as a set of statistics about the short-term future, as opposed to the classic approach of treating state as a latent, unobservable quantity. In other words, the past is summarized into predictions about the actions and observations in the short-term future, which can be used to make further predictions about the infinite future. While this representational idea applies to any dynamical system problem, it is particularly useful in a model-based RL context, when an agent must learn a representation of state and a model of system dynamics online: because the representation (and hence all of the model’s parameters) is defined using only statistics of observable quantities, the associated learning algorithms are often straightforward and have attractive theoretical properties. Here, we survey the basic concepts of predictively defined representations of state, important auxiliary constructs (such as the system dynamics matrix), and theoretical results on their representational power and learnability.
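As a rough illustration of the idea that predictions about the short-term future can themselves serve as state, the sketch below tracks a predictive state for a small uncontrolled system and checks it against the classic latent-state (belief) filter. Everything in it is an assumption made for illustration and not taken from the chapter: the two-state hidden Markov model, its parameters `T` and `O`, the choice of the two one-step observations as core tests, and the helper names. The update parameters are derived from the known model here rather than learned from data.

```python
# Hypothetical 2-state, 2-observation HMM (illustrative values only).
# Convention assumed here: the current state emits an observation, then transitions.
T = [[0.9, 0.1], [0.2, 0.8]]  # T[s][s2] = P(next state s2 | state s)
O = [[0.7, 0.3], [0.1, 0.9]]  # O[s][o]  = P(observation o | state s)


def matmul(A, B):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inv2(M):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Observable operators: A[o][s][s2] = P(emit o, move to s2 | state s).
A = [[[O[s][o] * T[s][s2] for s2 in range(2)] for s in range(2)]
     for o in range(2)]

# Core tests: "next observation is 0" and "next observation is 1".
# U[s][q] is the probability that test q succeeds from latent state s.
U = [[O[s][q] for q in range(2)] for s in range(2)]
U_inv = inv2(U)

# Linear update matrices expressed purely in terms of predictions.
M = [matmul(matmul(U_inv, A[o]), U) for o in range(2)]


def belief_update(b, o):
    """Classic latent-state filter: Bayes update of the belief over states."""
    unnorm = matmul([b], A[o])[0]
    z = sum(unnorm)
    return [x / z for x in unnorm]


def predictions_from_belief(b):
    """Core-test predictions implied by a belief state."""
    return matmul([b], U)[0]


def psr_update(p, o):
    """Predictive-state update: uses only the prediction vector p."""
    unnorm = matmul([p], M[o])[0]
    # Because the core tests are the one-step observations, P(o | history) = p[o].
    return [x / p[o] for x in unnorm]


def track(observations, b0=(0.5, 0.5)):
    """Run both filters side by side and return (predictive state, belief)."""
    b = list(b0)
    p = predictions_from_belief(b)
    for o in observations:
        p = psr_update(p, o)
        b = belief_update(b, o)
    return p, b


if __name__ == "__main__":
    p, b = track([0, 1, 1, 0, 1])
    print("predictive state:", p)
    print("from belief     :", predictions_from_belief(b))
```

The predictive state `p` never refers to the latent state after initialization, yet it agrees with the predictions implied by the belief filter. This works in the sketch only because the two emission distributions differ, which makes `U` invertible; a system with more latent structure would generally need longer tests.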
