Symbolic Dynamic Programming for First-Order MDPs

We present a dynamic programming approach for the solution of first-order Markov decisions processes. This technique uses an MDP whose dynamics is represented in a variant of the situation calculus allowing for stochastic actions. It produces a logical description of the optimal value function and policy by constructing a set of first-order formulae that minimally partition state space according to distinctions made by the value function and policy. This is achieved through the use of an operation known as decision-theoretic regression. In effect, our algorithm performs value iteration without explicit enumeration of either the state or action spaces of the MDP. This allows problems involving relational fluents and quantification to be solved without requiring explicit state space enumeration or conversion to propositional form.

[1]  J. McCarthy Situations, Actions, and Causal Laws , 1963 .

[2]  Marvin Minsky,et al.  Semantic Information Processing , 1968 .

[3]  Raymond Reiter,et al.  The Frame Problem in the Situation Calculus: A Simple Solution (Sometimes) and a Completeness Result for Goal Regression , 1991, Artificial and Mathematical Theory of Computation.

[4]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[5]  Hector J. Levesque,et al.  Reasoning about Noisy Sensors in the Situation Calculus , 1995, IJCAI.

[6]  John N. Tsitsiklis,et al.  Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.

[7]  David Poole,et al.  The Independent Choice Logic for Modelling Multiple Agents Under Uncertainty , 1997, Artif. Intell..

[8]  Blai Bonet High-Level Planning and Control with Incomplete Information Using POMDP's , 1998 .

[9]  Raymond Reiter,et al.  Some contributions to the metatheory of the situation calculus , 1999, JACM.

[10]  Craig Boutilier,et al.  Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..

[11]  Jesse Hoey,et al.  SPUDD: Stochastic Planning using Decision Diagrams , 1999, UAI.

[12]  Craig Boutilier,et al.  Decision-Theoretic, High-Level Agent Programming in the Situation Calculus , 2000, AAAI/IAAI.

[13]  Craig Boutilier,et al.  Stochastic dynamic programming with factored representations , 2000, Artif. Intell..

[14]  Raymond Reiter,et al.  Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems , 2001 .