Stochastic dynamic programming with factored representations

[1]  Jesse Hoey,et al.  SPUDD: Stochastic Planning using Decision Diagrams , 1999, UAI.

[2]  Craig Boutilier,et al.  Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..

[3]  Thomas G. Dietterich.  The MAXQ Method for Hierarchical Reinforcement Learning , 1998, ICML.

[4]  Ronen I. Brafman,et al.  Structured Reachability Analysis for Markov Decision Processes , 1998, UAI.

[5]  Kee-Eung Kim,et al.  Solving Very Large Weakly Coupled Markov Decision Processes , 1998, AAAI/IAAI.

[6]  Doina Precup,et al.  Theoretical Results on Reinforcement Learning with Temporally Abstract Options , 1998, ECML.

[7]  Konkoly Thege.  Multi-criteria Reinforcement Learning , 1998 .

[8]  R. Sutton,et al.  Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors , 1998 .

[9]  Satinder P. Singh,et al.  How to Dynamically Merge Markov Decision Processes , 1997, NIPS.

[10]  Ronen I. Brafman,et al.  Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning , 1997, IJCAI.

[11]  Robert Givan,et al.  Model Minimization, Regression, and Propositional STRIPS Planning , 1997, IJCAI.

[12]  Robert Givan,et al.  Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes , 1997, UAI.

[13]  Craig Boutilier,et al.  Correlated Action Effects in Decision Theoretic Regression , 1997, UAI.

[14]  Robert Givan,et al.  Model Minimization in Markov Decision Processes , 1997, AAAI/IAAI.

[15]  David Poole,et al.  The Independent Choice Logic for Modelling Multiple Agents Under Uncertainty , 1997, Artif. Intell..

[16]  Craig Boutilier,et al.  Abstraction and Approximate Decision-Theoretic Planning , 1997, Artif. Intell..

[17]  Craig Boutilier,et al.  Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.

[18]  Craig Boutilier,et al.  Context-Specific Independence in Bayesian Networks , 1996, UAI.

[19]  Craig Boutilier,et al.  The Frame Problem and Bayesian Network Action Representation , 1996, Canadian Conference on AI.

[20]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[21]  Craig Boutilier,et al.  Approximate Value Trees in Structured Dynamic Programming , 1996, ICML.

[22]  John N. Tsitsiklis,et al.  Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.

[23]  Michael L. Littman,et al.  Algorithms for Sequential Decision Making , 1996 .

[24]  Craig Boutilier,et al.  Process-Oriented Planning and Average-Reward Optimality , 1995, IJCAI.

[25]  Nicholas Kushmerick,et al.  An Algorithm for Probabilistic Planning , 1995, Artif. Intell..

[26]  Leslie Pack Kaelbling,et al.  Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.

[27]  Stuart J. Russell,et al.  Control Strategies for a Stochastic Planner , 1994, AAAI.

[28]  Craig Boutilier,et al.  Using Abstractions for Decision-Theoretic Planning with Time Constraints , 1994, AAAI.

[29]  Moisés Goldszmidt,et al.  Action Networks: A Framework for Reasoning about Actions and Change under Uncertainty , 1994, UAI.

[30]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[31]  Gerald Tesauro,et al.  TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.

[32]  Drew McDermott,et al.  Modeling a Dynamic and Uncertain World I: Symbolic and Probabilistic Reasoning About Change , 1994, Artif. Intell..

[33]  Andrew W. Moore,et al.  Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.

[34]  R. I. Bahar,et al.  Algebraic decision diagrams and their applications , 1993, Proceedings of 1993 International Conference on Computer Aided Design (ICCAD).

[35]  David Poole,et al.  Probabilistic Horn Abduction and Bayesian Networks , 1993, Artif. Intell..

[36]  Leslie Pack Kaelbling,et al.  Planning With Deadlines in Stochastic Domains , 1993, AAAI.

[37]  Solomon Eyal Shimony,et al.  The role of relevance in explanation I: Irrelevance as statistical independence , 1993, Int. J. Approx. Reason..

[38]  James E. Smith,et al.  Structuring Conditional Relationships in Influence Diagrams , 1993, Oper. Res..

[39]  Daniel S. Weld,et al.  UCPOP: A Sound, Complete, Partial Order Planner for ADL , 1992, KR.

[40]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[41]  Edmund M. Clarke,et al.  Symbolic Model Checking: 10^20 States and Beyond , 1990, Inf. Comput..

[42]  E. Clarke,et al.  Symbolic Model Checking: 10^20 States and Beyond , 1992 .

[43]  David Heckerman,et al.  Advances in Probabilistic Reasoning , 1994, Conference on Uncertainty in Artificial Intelligence.

[44]  Leslie Pack Kaelbling,et al.  Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.

[45]  David A. McAllester,et al.  Systematic Nonlinear Planning , 1991, AAAI.

[46]  W. Lovejoy.  A survey of algorithmic methods for partially observed Markov decision processes , 1991 .

[47]  Richard S. Sutton,et al.  Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.

[48]  Ross D. Shachter,et al.  Dynamic programming and influence diagrams , 1990, IEEE Trans. Syst. Man Cybern..

[49]  Drew V. McDermott,et al.  Projecting plans for uncertain worlds , 1990 .

[50]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[51]  D. Bertsekas,et al.  Adaptive aggregation methods for infinite horizon dynamic programming , 1989 .

[52]  Keiji Kanazawa,et al.  A model for reasoning about persistence and causation , 1989 .

[53]  Ronald A. Howard,et al.  Readings on the Principles and Applications of Decision Analysis , 1989 .

[54]  Marcel Schoppers,et al.  Universal Plans for Reactive Robots in Unpredictable Environments , 1987, IJCAI.

[55]  David Chapman,et al.  Planning for Conjunctive Goals , 1987, Artif. Intell..

[56]  Dimitri P. Bertsekas,et al.  Dynamic Programming: Deterministic and Stochastic Models , 1987 .

[57]  Ross D. Shachter.  Evaluating Influence Diagrams , 1986, Oper. Res..

[58]  Randal E. Bryant,et al.  Graph-Based Algorithms for Boolean Function Manipulation , 1986, IEEE Transactions on Computers.

[59]  Paul J. Schweitzer,et al.  Iterative Aggregation-Disaggregation Procedures for Discounted Semi-Markov Reward Processes , 1985, Oper. Res..

[60]  A. Prasad Sistla,et al.  Automatic verification of finite state concurrent systems using temporal logic specifications: a practical approach , 1983, POPL '83.

[61]  John McCarthy,et al.  Some Philosophical Problems from the Standpoint of Artificial Intelligence , 1987 .

[62]  R. L. Keeney,et al.  Decisions with Multiple Objectives: Preferences and Value Trade-Offs , 1977, IEEE Transactions on Systems, Man, and Cybernetics.

[63]  M. Puterman,et al.  Modified Policy Iteration Algorithms for Discounted Markov Decision Problems , 1978 .

[64]  Edward J. Sondik,et al.  The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..

[65]  Jerome A. Feldman,et al.  Decision Theory and Artificial Intelligence II: The Hungry Monkey , 1977, Cogn. Sci..

[66]  Richard Waldinger,et al.  Achieving several goals simultaneously , 1977 .

[67]  Ronald L. Rivest,et al.  Constructing Optimal Binary Decision Trees is NP-Complete , 1976, Inf. Process. Lett..

[68]  Earl D. Sacerdoti,et al.  The Nonlinear Nature of Plans , 1975, IJCAI.

[69]  Edsger W. Dijkstra,et al.  Guarded commands, nondeterminacy and formal derivation of programs , 1975, Commun. ACM.

[70]  Edward J. Sondik,et al.  The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..

[71]  Richard Fikes,et al.  STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.

[72]  Donald Michie,et al.  Machine Intelligence 4 , 1970 .

[73]  R. Bellman.  Dynamic programming , 1957, Science.

[74]  J. Hartmanis,et al.  Algebraic Structure Theory Of Sequential Machines , 1966 .

[75]  R. Howard.  Dynamic Programming and Markov Processes , 1960 .