Purely Epistemic Markov Decision Processes

Planning under uncertainty involves two distinct sources of uncertainty: uncertainty about the effects of actions and uncertainty about the current state of the world. The most widely developed model that handles both sources of uncertainty is the Partially Observable Markov Decision Process (POMDP). Removing the second source of uncertainty from POMDPs yields the well-known framework of fully observable MDPs. Removing the first source instead yields a less widely studied framework: decision processes in which actions cannot change the state of the world and serve only to gather information about its (static) state. Such "purely epistemic" processes are highly relevant, since many practical problems (such as diagnosis, database querying, and preference elicitation) fall into this class. However, it is not known whether this restriction of POMDPs is computationally simpler than the general model. In this paper we establish several complexity results for purely epistemic MDPs (EMDPs). We first show that short-horizon policy existence in EMDPs is PSPACE-complete. We then focus on the special case of EMDPs with reliable observations and show that policy existence becomes "only" NP-complete; however, this problem cannot be approximated within a bounded performance ratio by any polynomial-time algorithm.
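To make the model concrete, the following is a minimal sketch of an EMDP in Python: the hidden state is static, each action only yields a noisy observation, and so belief tracking reduces to Bayesian conditioning with no transition step. All state names, action names, and probabilities below are hypothetical illustrations, not taken from the paper.

```python
import random

# Hypothetical toy EMDP: a hidden fault in {"f1", "f2", "f3"} never changes;
# each "test" action returns a noisy observation about it.
STATES = ["f1", "f2", "f3"]

# OBS_PROB[action][state][observation] = P(observation | state, action).
# Values are illustrative only.
OBS_PROB = {
    "test_a": {
        "f1": {"pos": 0.9, "neg": 0.1},
        "f2": {"pos": 0.2, "neg": 0.8},
        "f3": {"pos": 0.5, "neg": 0.5},
    },
    "test_b": {
        "f1": {"pos": 0.1, "neg": 0.9},
        "f2": {"pos": 0.9, "neg": 0.1},
        "f3": {"pos": 0.5, "neg": 0.5},
    },
}

def update_belief(belief, action, observation):
    """Bayesian belief update: because actions are purely epistemic,
    the hidden state is static and no transition step is needed."""
    posterior = {s: belief[s] * OBS_PROB[action][s][observation] for s in STATES}
    z = sum(posterior.values())  # marginal probability of the observation
    return {s: p / z for s, p in posterior.items()}

def sample_observation(true_state, action):
    """Draw an observation from P(. | true_state, action)."""
    dist = OBS_PROB[action][true_state]
    r, acc = random.random(), 0.0
    for obs, p in dist.items():
        acc += p
        if r < acc:
            return obs
    return obs  # guard against floating-point rounding

if __name__ == "__main__":
    true_state = "f2"
    belief = {s: 1 / len(STATES) for s in STATES}  # uniform prior
    for action in ["test_a", "test_b", "test_a"]:
        obs = sample_observation(true_state, action)
        belief = update_belief(belief, action, obs)
        print(action, obs, {s: round(p, 3) for s, p in belief.items()})
```

Running the loop shows the belief concentrating on the true (static) state as observations accumulate, which is exactly the kind of information-gathering behavior an EMDP policy must plan for.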