Solving Relational MDPs with First-Order Machine Learning

We present a new formulation of Relational Markov Decision Processes (RMDPs) that is simpler than the situation-calculus approach of Boutilier, Reiter and Price. In addition, we describe our initial efforts toward a novel, machine-learning-based method for computing an RMDP's policy. Our technique instantiates the RMDP into a number of propositional MDPs, which are then solved for their value functions. First-order regression is then used to learn a value function for the complete RMDP. This value function can then be used to produce a policy for huge decision-theoretic planning problems, yielding compact solutions without requiring explicit state-space enumeration. Finally, we extend our RMDP formalism to cover the case of a dynamic universe, i.e., one in which action effects may create new objects or destroy existing ones.
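
To make the pipeline concrete, the following Python sketch illustrates the instantiate-solve-regress loop: ground the RMDP over several small universes, solve each resulting propositional MDP exactly by value iteration, and pool the resulting (state, value) pairs as training data for a first-order regression learner such as structural regression trees [22] or first-order regression [47]. This is only a minimal sketch: the functions instantiate_rmdp and describe, and the universes argument, are hypothetical placeholders, not the paper's actual interface.

```python
import numpy as np

def value_iteration(n_states, n_actions, P, R, gamma=0.9, eps=1e-6):
    """Solve a small ground (propositional) MDP exactly.

    P[a] is the (n_states x n_states) transition matrix for action a;
    R is the (n_states,) reward vector.
    """
    V = np.zeros(n_states)
    while True:
        # Bellman backup: Q(s,a) = R(s) + gamma * sum_s' P(s'|s,a) V(s')
        Q = np.stack([R + gamma * P[a] @ V for a in range(n_actions)])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < eps:
            return V_new
        V = V_new

def build_training_set(instantiate_rmdp, universes):
    """Pool (relational state description, value) examples across universes.

    instantiate_rmdp and describe are hypothetical: instantiate_rmdp
    grounds the RMDP for a given universe of objects, and describe maps
    a ground state index back to its first-order description, which is
    the representation the regression learner generalizes over.
    """
    examples = []
    for universe in universes:
        n_states, n_actions, P, R, describe = instantiate_rmdp(universe)
        V = value_iteration(n_states, n_actions, P, R)
        examples.extend((describe(s), V[s]) for s in range(n_states))
    return examples
```

Because the learner fits the value function in first-order terms rather than over ground states, the result can be applied greedily in universes of any size, which is what allows compact policies without enumerating the state space.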

[1] J. Rissanen et al. Modeling By Shortest Data Description, 1978, Autom.

[2] Randal E. Bryant et al. Graph-Based Algorithms for Boolean Function Manipulation, 1986, IEEE Transactions on Computers.

[3] David E. Smith. Controlling Backward Inference, 1989, Artif. Intell.

[4] Keiji Kanazawa et al. A model for reasoning about persistence and causation, 1989.

[5] Brian Falkenhainer et al. Dynamic Constraint Satisfaction Problems, 1990, AAAI.

[6] V. S. Subrahmanian et al. Probabilistic Logic Programming, 1992, Inf. Comput.

[7] Mark A. Peot et al. Conditional nonlinear planning, 1992.

[8] J. Ross Quinlan et al. C4.5: Programs for Machine Learning, 1992.

[9] Robert P. Goldman et al. From knowledge bases to decision models, 1992, The Knowledge Engineering Review.

[10] Oren Etzioni et al. An Approach to Planning with Incomplete Information, 1992, KR.

[11] Robert P. Goldman et al. Conditional Linear Planning, 1994, AIPS.

[12] Daniel S. Weld et al. Probabilistic Planning with Information Gathering and Contingent Execution, 1994, AIPS.

[13] Oren Etzioni et al. Omnipotence without Omniscience: Sensor Management in Planning, 1994, AAAI.

[14] Daniel S. Weld. An Introduction to Least Commitment Planning, 1994, AI Mag.

[15] Peter Haddawy et al. Generating Bayesian Networks from Probability Logic Knowledge Bases, 1994, UAI.

[16] Drew McDermott et al. Modeling a Dynamic and Uncertain World I: Symbolic and Probabilistic Reasoning About Change, 1994, Artif. Intell.

[17] Nicholas Kushmerick et al. An Algorithm for Probabilistic Planning, 1995, Artif. Intell.

[18] Avrim Blum et al. Fast Planning Through Planning Graph Analysis, 1995, IJCAI.

[19] Andrew G. Barto et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.

[20] S. Muggleton. Stochastic Logic Programs, 1996.

[21] Mark A. Peot et al. Suspending Recursion in Causal-Link Planning, 1996, AIPS.

[22] Stefan Kramer et al. Structural Regression Trees, 1996, AAAI/IAAI, Vol. 1.

[23] Gregg Collins et al. Planning for Contingencies: A Decision-based Approach, 1996, J. Artif. Intell. Res.

[24] David E. Smith et al. Conformant Graphplan, 1998, AAAI/IAAI.

[25] David E. Smith et al. Extending Graphplan to handle uncertainty and sensing actions, 1998, AAAI.

[26] Jim Blythe et al. Planning Under Uncertainty in Dynamic Domains, 1998.

[27] Ronen I. Brafman et al. Structured Reachability Analysis for Markov Decision Processes, 1998, UAI.

[28] Daphne Koller et al. Computing Factored Value Functions for Policies in Structured MDPs, 1999, IJCAI.

[29] Craig Boutilier et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.

[30] Lise Getoor et al. Learning Probabilistic Relational Models, 1999, IJCAI.

[31] John Langford et al. Probabilistic Planning in the Graphplan Framework, 1999, ECP.

[32] Jesse Hoey et al. SPUDD: Stochastic Planning using Decision Diagrams, 1999, UAI.

[33] Daphne Koller et al. Policy Iteration for Factored MDPs, 2000, UAI.

[34] Marco Roveri et al. Conformant Planning via Symbolic Model Checking, 2000, J. Artif. Intell. Res.

[35] Blai Bonet et al. Planning with Incomplete Information as Heuristic Search in Belief Space, 2000, AIPS.

[36] Ben Taskar et al. Learning Probabilistic Models of Relational Structure, 2001, ICML.

[37] Luc De Raedt et al. Adaptive Bayesian Logic Programs, 2001, ILP.

[38] Carlos Guestrin et al. Max-norm Projections for Factored MDPs, 2001, IJCAI.

[39] Craig Boutilier et al. Symbolic Dynamic Programming for First-Order MDPs, 2001, IJCAI.

[40] Luc De Raedt et al. Towards Combining Inductive Logic Programming with Bayesian Networks, 2001, ILP.

[41] Pedro M. Domingos et al. Relational Markov models and their application to adaptive web navigation, 2002, KDD.

[42] Piergiorgio Bertoli et al. Improving Heuristics for Planning as Search in Belief Space, 2002, AIPS.

[43] Zhengzhu Feng et al. Symbolic heuristic search for factored Markov decision processes, 2002, AAAI/IAAI.

[44] Robert Givan et al. Inductive Policy Selection for First-Order MDPs, 2002, UAI.

[45] Carlos Guestrin et al. Generalizing plans to new environments in relational MDPs, 2003, IJCAI.

[46] Pedro M. Domingos et al. Dynamic Probabilistic Relational Models, 2003, IJCAI.

[47] Ivan Bratko et al. First Order Regression, 1997, Machine Learning.