[1] A. S. Manne. Linear Programming and Sequential Decisions, 1960.
[2] George B. Dantzig, et al. Linear programming and extensions, 1965.
[3] Herbert A. Simon, et al. The Sciences of the Artificial, 1970.
[4] H. Kushner, et al. Decomposition of systems governed by Markov chains, 1974.
[5] J. Birge. Solution methods for stochastic dynamic linear programs, 1980.
[6] Nesa L'abbe Wu, et al. Linear programming and extensions, 1981.
[7] Mark Stefik. H. A. Simon, The Sciences of the Artificial, 1984, Artif. Intell.
[8] P. Schweitzer, et al. Generalized polynomial approximations in Markovian decision processes, 1985.
[9] David J. Spiegelhalter, et al. Local computations with probabilities on graphical structures and their application to expert systems, 1990.
[10] Keiji Kanazawa, et al. A model for reasoning about persistence and causation, 1989.
[11] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[12] Thomas Dean, et al. Decomposition Techniques for Planning in Stochastic Domains, 1995, IJCAI.
[13] Avi Pfeffer, et al. Object-Oriented Bayesian Networks, 1997, UAI.
[14] Satinder P. Singh, et al. How to Dynamically Merge Markov Decision Processes, 1997, NIPS.
[15] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[16] Milos Hauskrecht, et al. Hierarchical Solution of Markov Decision Processes using Macro-actions, 1998, UAI.
[17] Kee-Eung Kim, et al. Solving Very Large Weakly Coupled Markov Decision Processes, 1998, AAAI/IAAI.
[18] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[19] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[20] Daphne Koller, et al. Computing Factored Value Functions for Policies in Structured MDPs, 1999, IJCAI.
[21] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[22] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[23] Carlos Guestrin, et al. Multiagent Planning with Factored MDPs, 2001, NIPS.
[24] Benjamin Van Roy, et al. Approximate Dynamic Programming via Linear Programming, 2001, NIPS.