Association rule-based Markov Decision Processes

In this paper we present a novel approach for the fast solution of Markov decision processes (MDPs) based on the concept of association rules. MDPs have been applied successfully to many probabilistic problems, such as process control, decision analysis and economics. For problems with continuous or high-dimensional domains, however, the computational cost becomes prohibitive because the search space grows exponentially with the number of variables. To reduce this complexity, we propose a new approach that represents, learns and applies the actions that actually operate on the current state as a small set of association rules, together with a new value iteration algorithm driven by those rules. Experimental results on a robot path-planning task indicate that the solution time, and therefore the complexity, of the proposed approach is considerably reduced, since it grows only linearly with the number of states.
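As a minimal sketch of the idea, assuming a tabular MDP in which previously mined association rules map each state to the small subset of actions that actually operate on it (all names and data structures below are illustrative, not the authors' implementation), a rule-restricted value iteration could look like this:

```python
# Illustrative sketch only: value iteration restricted by association rules.
# The rule mining step (e.g., an Apriori-style pass over observed
# state-action-state transitions) is abstracted away; `applicable` stands in
# for the learned rules that say which actions really operate on each state.

def value_iteration(states, P, R, applicable, gamma=0.9, eps=1e-6):
    """P[(s, a)] is a dict {s_next: prob}; R[(s, a)] is the immediate reward.
    applicable[s] is the (small) set of actions the mined rules allow in s."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Sweep only over the rule-selected actions, not the full action set.
            q_values = [
                R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())
                for a in applicable[s]
            ]
            best = max(q_values) if q_values else 0.0
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V


# Tiny hypothetical example: the rules permit only 'go' in state s0
# and only 'stay' in state s1.
states = ["s0", "s1"]
P = {("s0", "go"): {"s1": 1.0}, ("s1", "stay"): {"s1": 1.0}}
R = {("s0", "go"): 1.0, ("s1", "stay"): 0.0}
applicable = {"s0": {"go"}, "s1": {"stay"}}
print(value_iteration(states, P, R, applicable))
```

The point of the sketch is the inner loop: because each state's Bellman backup only considers the few actions the rules select, the per-iteration cost scales with the number of states rather than with the full state-action product.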
