Learning and discovery of predictive state representations in dynamical systems with reset

Predictive state representations (PSRs) are a recently proposed way of modeling controlled dynamical systems. PSR-based models use predictions of observable outcomes of tests that could be done on the system as their state representation, and have model parameters that define how the predictive state representation changes over time as actions are taken and observations noted. Learning PSR-based models requires solving two subproblems: 1) discovery of the tests whose predictions constitute state, and 2) learning the model parameters that define the dynamics. So far, there have been no results available on the discovery subproblem while for the learning subproblem an approximate-gradient algorithm has been proposed (Singh et al., 2003) with mixed results (it works on some domains and not on others). In this paper, we provide the first discovery algorithm and a new learning algorithm for linear PSRs for the special class of controlled dynamical systems that have a reset operation. We provide experimental verification of our algorithms. Finally, we also distinguish our work from prior work by Jaeger (2000) on observable operator models (OOMs).