Computing Factored Value Functions for Policies in Structured MDPs