Stochastic dynamic programming with factored representations

Craig Boutilier | Richard Dearden | Moisés Goldszmidt

[1] Jesse Hoey,et al. SPUDD: Stochastic Planning using Decision Diagrams , 1999, UAI.

[2] Craig Boutilier,et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..

[3] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning , 1998, ICML.

[4] Ronen I. Brafman,et al. Structured Reachability Analysis for Markov Decision Processes , 1998, UAI.

[5] Kee-Eung Kim,et al. Solving Very Large Weakly Coupled Markov Decision Processes , 1998, AAAI/IAAI.

[6] Doina Precup,et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options , 1998, ECML.

[7] Konkoly Thege. Multi-criteria Reinforcement Learning , 1998 .

[8] R. Sutton,et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors , 1998 .

[9] Satinder P. Singh,et al. How to Dynamically Merge Markov Decision Processes , 1997, NIPS.

[10] Ronen I. Brafman,et al. Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning , 1997, IJCAI.

[11] Robert Givan,et al. Model Minimization, Regression, and Propositional STRIPS Planning , 1997, IJCAI.

[12] Robert Givan,et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes , 1997, UAI.

[13] Craig Boutilier,et al. Correlated Action Effects in Decision Theoretic Regression , 1997, UAI.

[14] Robert Givan,et al. Model Minimization in Markov Decision Processes , 1997, AAAI/IAAI.

[15] David Poole,et al. The Independent Choice Logic for Modelling Multiple Agents Under Uncertainty , 1997, Artif. Intell..

[16] Craig Boutilier,et al. Abstraction and Approximate Decision-Theoretic Planning , 1997, Artif. Intell..

[17] Craig Boutilier,et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.

[18] Craig Boutilier,et al. Context-Specific Independence in Bayesian Networks , 1996, UAI.

[19] Craig Boutilier,et al. The Frame Problem and Bayesian Network Action Representation , 1996, Canadian Conference on AI.

[20] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[21] Craig Boutilier,et al. Approximate Value Trees in Structured Dynamic Programming , 1996, ICML.

[22] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.

[23] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .

[24] Craig Boutilier,et al. Process-Oriented Planning and Average-Reward Optimality , 1995, IJCAI.

[25] Nicholas Kushmerick,et al. An Algorithm for Probabilistic Planning , 1995, Artif. Intell..

[26] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.

[27] Stuart J. Russell,et al. Control Strategies for a Stochastic Planner , 1994, AAAI.

[28] Craig Boutilier,et al. Using Abstractions for Decision-Theoretic Planning with Time Constraints , 1994, AAAI.

[29] Moisés Goldszmidt,et al. Action Networks: A Framework for Reasoning about Actions and Change under Uncertainty , 1994, UAI.

[30] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[31] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.

[32] Drew McDermott,et al. Modeling a Dynamic and Uncertain World I: Symbolic and Probabilistic Reasoning About Change , 1994, Artif. Intell..

[33] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.

[34] R. I. Bahar,et al. Algebraic decision diagrams and their applications , 1993, Proceedings of 1993 International Conference on Computer Aided Design (ICCAD).

[35] David Poole,et al. Probabilistic Horn Abduction and Bayesian Networks , 1993, Artif. Intell..

[36] Leslie Pack Kaelbling,et al. Planning With Deadlines in Stochastic Domains , 1993, AAAI.

[37] Solomon Eyal Shimony,et al. The role of relevance in explanation I: Irrelevance as statistical independence , 1993, Int. J. Approx. Reason..

[38] James E. Smith,et al. Structuring Conditional Relationships in Influence Diagrams , 1993, Oper. Res..

[39] Daniel S. Weld,et al. UCPOP: A Sound, Complete, Partial Order Planner for ADL , 1992, KR.

[40] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .

[41] Edmund M. Clarke,et al. Symbolic Model Checking: 10^20 States and Beyond , 1990, Inf. Comput..

[42] E. Clarke,et al. Symbolic Model Checking: 10^20 States and Beyond , 1992 .

[43] David Heckerman,et al. Advances in Probabilistic Reasoning , 1994, Conference on Uncertainty in Artificial Intelligence.

[44] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.

[45] David A. McAllester,et al. Systematic Nonlinear Planning , 1991, AAAI.

[46] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .

[47] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.

[48] Ross D. Shachter,et al. Dynamic programming and influence diagrams , 1990, IEEE Trans. Syst. Man Cybern..

[49] Drew V. McDermott,et al. Projecting plans for uncertain worlds , 1990 .

[50] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[51] D. Bertsekas,et al. Adaptive aggregation methods for infinite horizon dynamic programming , 1989 .

[52] Keiji Kanazawa,et al. A model for reasoning about persistence and causation , 1989 .

[53] Ronald A. Howard,et al. Readings on the Principles and Applications of Decision Analysis , 1989 .

[54] Marcel Schoppers,et al. Universal Plans for Reactive Robots in Unpredictable Environments , 1987, IJCAI.

[55] David Chapman,et al. Planning for Conjunctive Goals , 1987, Artif. Intell..

[56] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .

[57] Ross D. Shachter. Evaluating Influence Diagrams , 1986, Oper. Res..

[58] Randal E. Bryant,et al. Graph-Based Algorithms for Boolean Function Manipulation , 1986, IEEE Transactions on Computers.

[59] Paul J. Schweitzer,et al. Iterative Aggregation-Disaggregation Procedures for Discounted Semi-Markov Reward Processes , 1985, Oper. Res..

[60] A. Prasad Sistla,et al. Automatic verification of finite state concurrent system using temporal logic specifications: a practical approach , 1983, POPL '83.

[61] John McCarthy,et al. SOME PHILOSOPHICAL PROBLEMS FROM THE STANDPOINT OF ARTIFICIAL INTELLIGENCE , 1987 .

[62] R. L. Keeney,et al. Decisions with Multiple Objectives: Preferences and Value Trade-Offs , 1977, IEEE Transactions on Systems, Man, and Cybernetics.

[63] M. Puterman,et al. Modified Policy Iteration Algorithms for Discounted Markov Decision Problems , 1978 .

[64] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..

[65] Jerome A. Feldman,et al. Decision Theory and Artificial Intelligence II: The Hungry Monkey , 1977, Cogn. Sci..

[66] Richard Waldinger,et al. Achieving several goals simultaneously , 1977 .

[67] Ronald L. Rivest,et al. Constructing Optimal Binary Decision Trees is NP-Complete , 1976, Inf. Process. Lett..

[68] Earl D. Sacerdoti,et al. The Nonlinear Nature of Plans , 1975, IJCAI.

[69] Edsger W. Dijkstra,et al. Guarded commands, nondeterminacy and formal derivation of programs , 1975, Commun. ACM.

[70] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..

[71] Richard Fikes,et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.

[72] Donald Michie,et al. Machine Intelligence 4 , 1970 .

[73] R. Bellman. Dynamic programming. , 1957, Science.

[74] J. Hartmanis,et al. Algebraic Structure Theory Of Sequential Machines , 1966 .

[75] R. Howard. Dynamic Programming and Markov Processes , 1960 .