Planning and Acting in Partially Observable Stochastic Domains

[1]  Alexander Schrijver,et al.  Theory of linear and integer programming , 1986, Wiley-Interscience series in discrete mathematics and optimization.

[2]  Peter Haddawy,et al.  Utility Models for Goal‐Directed, Decision‐Theoretic Planners , 1998, Comput. Intell..

[3]  Michael L. Littman,et al.  MAXPLAN: A New Approach to Probabilistic Planning , 1998, AIPS.

[4]  A. Cassandra,et al.  Exact and approximate algorithms for partially observable Markov decision processes , 1998 .

[5]  Eric A. Hansen,et al.  An Improved Policy Iteration Algorithm for Partially Observable MDPs , 1997, NIPS.

[6]  Munindar P. Singh,et al.  Readings in agents , 1997 .

[7]  Michael L. Littman,et al.  Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes , 1997, UAI.

[8]  Craig Boutilier,et al.  Rewarding Behaviors , 1996, AAAI/IAAI, Vol. 2.

[9]  Craig Boutilier,et al.  Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.

[10]  Gregg Collins,et al.  Planning for Contingencies: A Decision-based Approach , 1996, J. Artif. Intell. Res..

[11]  T. Dean,et al.  Generating optimal policies for high-level plans with conditional branches and loops , 1996 .

[12]  Pattie Maes,et al.  Incremental Self-Improvement for Life-Time Multi-Agent Reinforcement Learning , 1996 .

[13]  M. Paterson,et al.  The complexity of mean payoff games on graphs , 1996 .

[14]  Michael L. Littman,et al.  Algorithms for Sequential Decision Making , 1996 .

[15]  Wenju Liu,et al.  Planning in Stochastic Domains: Problem Characteristics and Approximation , 1996 .

[16]  Leslie Pack Kaelbling,et al.  Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.

[17]  Uri Zwick,et al.  The Complexity of Mean Payoff Games on Graphs , 1996, Theor. Comput. Sci..

[18]  M. Littman,et al.  Efficient dynamic-programming updates in partially observable Markov decision processes , 1995 .

[19]  Uri Zwick,et al.  The Complexity of Mean Payoff Games , 1995, COCOON.

[20]  Avrim Blum,et al.  Fast Planning Through Planning Graph Analysis , 1995, IJCAI.

[21]  Andrew McCallum,et al.  Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.

[22]  Nicholas Kushmerick,et al.  An Algorithm for Probabilistic Planning , 1995, Artif. Intell..

[23]  Dimitri P. Bertsekas,et al.  Dynamic Programming and Optimal Control, Two Volume Set , 1995 .

[24]  Leslie Pack Kaelbling,et al.  Planning under Time Constraints in Stochastic Domains , 1993, Artif. Intell..

[25]  David E. Smith,et al.  Representation and Evaluation of Plans with Loops , 1995 .

[26]  M. Littman The Witness Algorithm: Solving Partially Observable Markov Decision Processes , 1994 .

[27]  Eric A. Hansen,et al.  Cost-Effective Sensing during Plan Execution , 1994, AAAI.

[28]  Leslie Pack Kaelbling,et al.  Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.

[29]  Stuart J. Russell,et al.  Control Strategies for a Stochastic Planner , 1994, AAAI.

[30]  Robert P. Goldman,et al.  Epsilon-Safe Planning , 1994, UAI.

[31]  Jim Blythe,et al.  Planning with External Events , 1994, UAI.

[32]  Michael L. Littman,et al.  Memoryless policies: theoretical limitations and practical results , 1994 .

[33]  Daniel S. Weld,et al.  Probabilistic Planning with Information Gathering and Contingent Execution , 1994, AIPS.

[34]  Robert P. Goldman,et al.  Conditional Linear Planning , 1994, AIPS.

[35]  Robert P. Goldman,et al.  Representing Uncertainty in Simple Planners , 1994, KR.

[36]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[37]  Reid G. Simmons,et al.  Risk-Sensitive Planning with Probabilistic Decision Graphs , 1994, KR.

[38]  Hector J. Levesque,et al.  The Frame Problem and Knowledge-Producing Actions , 1993, AAAI.

[39]  Todd Michael Mansell,et al.  A method for Planning Given Uncertain and Incomplete Information , 1993, UAI.

[40]  Andrew McCallum,et al.  Overcoming Incomplete Perception with Utile Distinction Memory , 1993, ICML.

[41]  John R. Koza,et al.  Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.

[42]  Ronald J. Williams,et al.  Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions , 1993 .

[43]  Andreas Stolcke,et al.  Hidden Markov Model Induction by Bayesian Model Merging , 1992, NIPS.

[44]  Daniel S. Weld,et al.  UCPOP: A Sound, Complete, Partial Order Planner for ADL , 1992, KR.

[45]  Lonnie Chrisman,et al.  Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.

[46]  James A. Hendler,et al.  Proceedings of the first international conference on Artificial intelligence planning systems , 1992 .

[47]  Mark A. Peot,et al.  Conditional nonlinear planning , 1992 .

[48]  Anne Condon,et al.  The Complexity of Stochastic Games , 1992, Inf. Comput..

[49]  Sven Koenig,et al.  Optimal Probabilistic and Decision-Theoretic Planning using Markovian Decision Theory , 1992 .

[50]  David A. McAllester,et al.  Systematic Nonlinear Planning , 1991, AAAI.

[51]  W. Lovejoy A survey of algorithmic methods for partially observed Markov decision processes , 1991 .

[52]  Steven I. Marcus,et al.  On the computation of the optimal cost function for discrete time Markov models with partial observations , 1991, Ann. Oper. Res..

[53]  Ari Arapostathis,et al.  On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes , 1991, Ann. Oper. Res..

[54]  L. N. Kanal,et al.  Uncertainty in Artificial Intelligence 5 , 1990 .

[55]  P. Tseng Solving H-horizon, stationary Markov decision problems in time proportional to log(H) , 1990 .

[56]  John L. Bresina,et al.  Anytime Synthetic Projection: Maximizing the Probability of Goal Satisfaction , 1990, AAAI.

[57]  Chelsea C. White,et al.  Solution Procedures for Partially Observed Markov Decision Processes , 1989, Oper. Res..

[58]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[59]  Leora Morgenstern,et al.  Knowledge Preconditions for Actions and Plans , 1988, IJCAI.

[61]  Marcel Schoppers,et al.  Universal Plans for Reactive Robots in Unpredictable Environments , 1987, IJCAI.

[62]  Robert C. Moore,et al.  Formal Theories of the Commonsense World , 1985 .

[63]  James N. Eagle The Optimal Search for a Moving Target When the Search Path Is Constrained , 1984, Oper. Res..

[64]  Robert C. Moore A Formal Theory of Knowledge and Action , 1984 .

[66]  G. Monahan State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 1982 .

[67]  C. White,et al.  Application of Jensen's inequality to adaptive suboptimal design , 1980 .

[68]  Edward J. Sondik,et al.  The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..

[69]  K. Sawaki,et al.  OPTIMAL CONTROL FOR PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES OVER AN INFINITE HORIZON , 1978 .

[70]  Nils J. Nilsson,et al.  Artificial Intelligence , 1974, IFIP Congress.

[71]  Edward J. Sondik,et al.  The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..

[72]  Edward J. Sondik,et al.  The optimal control of partially observable Markov processes , 1971 .

[74]  Ronald A. Howard,et al.  Information Value Theory , 1966, IEEE Trans. Syst. Sci. Cybern..

[75]  Karl Johan Åström,et al.  Optimal control of Markov processes with incomplete state information , 1965 .

[76]  R. Howard Dynamic Programming and Markov Processes , 1960 .

[77]  R. E. Kalman,et al.  A New Approach to Linear Filtering and Prediction Problems , 1960 .