Factored Markov Decision Processes

EXAMPLE.– In the case of the car to be maintained, the number of possible states of the car can be huge: for instance, each part of the car can have its own wear-out state. The idea behind factored representations is that some parts of this huge state do not depend on each other, and that this structure can be exploited to derive a more compact representation of the global state and to compute an optimal policy more efficiently. For instance, changing the oil in the car should have no effect on the brakes, so one does not need to take the state of the brakes into account when determining an optimal oil-change policy.
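To make this concrete, below is a minimal sketch in Python of a factored transition model for the car example. The variable names ("oil", "brakes"), their value ranges, and the wear probabilities are hypothetical illustrations, not taken from the text; the point is only that each state variable is updated by its own local function, so the transition table over the joint state space never has to be enumerated.

```python
import random

# A minimal sketch of a factored transition model (hypothetical names
# and probabilities). Each state variable has its own local update rule
# that reads only the variables it depends on -- here, just itself.

def oil_next(state, action):
    if action == "change_oil":
        return 2                                   # fresh oil; brakes are irrelevant
    return max(0, state["oil"] - (random.random() < 0.3))    # gradual wear

def brakes_next(state, action):
    if action == "replace_brakes":
        return 2
    return max(0, state["brakes"] - (random.random() < 0.1))

TRANSITIONS = {"oil": oil_next, "brakes": brakes_next}

def step(state, action):
    # The global transition factors into per-variable updates: no table
    # over all oil/brakes combinations is ever built.
    return {var: update(state, action) for var, update in TRANSITIONS.items()}

state = {"oil": 1, "brakes": 2}                    # wear levels: 0 (worn) .. 2 (good)
print(step(state, "change_oil"))                   # brakes evolve independently
```

With n parts taking k wear levels each, the flat state space contains k^n states, whereas the factored model above stores only n local transition functions; this is exactly the structure that factored MDP algorithms exploit.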
