Paper Information - Notions of State Equivalence under Partial Observability

Notions of State Equivalence under Partial Observability

We explore equivalence relations between states in Markov Decision Processes and Partially Observable Markov Decision Processes. We focus on two different equivalence notions: bisimulation (Givan et al., 2003) and a notion of trace equivalence, under which states are considered equivalent, roughly, if they generate the same conditional probability distributions over observation sequences (where the conditioning is on action sequences). We show that the relationship between these two equivalence notions changes depending on the amount and nature of the partial observability. We also present an alternative characterization of bisimulation based on trajectory equivalence.

Doina Precup | P. S. Castro | P. Panangaden

[1] C. A. R. Hoare, et al. Communicating sequential processes, 1978, CACM.

[2] Robin Milner, et al. A Calculus of Communicating Systems, 1980, Lecture Notes in Computer Science.

[3] Kim G. Larsen, et al. Bisimulation through Probabilistic Testing, 1991, Inf. Comput.

[4] G. Winskel. The formal semantics of programming languages, 1993.

[5] Robert Givan, et al. Model Minimization in Markov Decision Processes, 1997, AAAI/IAAI.

[6] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.

[7] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.

[8] Robert Givan, et al. Equivalence notions and model minimization in Markov decision processes, 2003, Artif. Intell.

[9] Doina Precup, et al. Metrics for Finite Markov Decision Processes, 2004, AAAI.

[10] Joelle Pineau, et al. Tractable planning under uncertainty: exploiting structure, 2004.

[11] Doina Precup, et al. Bounding Performance Loss in Approximate MDP Homomorphisms, 2008, NIPS.