Processos de Decisão de Markov: um tutorial (Markov Decision Processes: A Tutorial)
[1] Fabio Gagliardi Cozman,et al. Unifying Nondeterministic and Probabilistic Planning Through Imprecise Markov Decision Processes , 2006, IBERAMIA-SBIA.
[2] Larry D. Pyeatt,et al. A Parallel Algorithm for POMDP Solution , 1999, ECP.
[3] Shlomo Zilberstein,et al. Region-Based Incremental Pruning for POMDPs , 2004, UAI.
[4] Eric A. Hansen,et al. An Improved Grid-Based Approximation Algorithm for POMDPs , 2001, IJCAI.
[5] Stuart E. Dreyfus,et al. Applied Dynamic Programming , 1965 .
[6] E. J. Sondik,et al. The Optimal Control of Partially Observable Markov Decision Processes , 1971 .
[7] Douglas Aberdeen,et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes , 2003 .
[8] Kevin D. Seppi,et al. Prioritization Methods for Accelerating MDP Solvers , 2005, J. Mach. Learn. Res..
[9] James S. Kakalik,et al. Optimum Policies for Partially Observable Markov Systems , 1965 .
[10] Douglas Aberdeen,et al. Scalable Internal-State Policy-Gradient Methods for POMDPs , 2002, ICML.
[11] Chelsea C. White,et al. Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes , 1994, Oper. Res..
[12] A. Cassandra. A Survey of POMDP Applications , 2003 .
[13] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[14] P. Poupart. Exploiting structure to efficiently solve large scale partially observable Markov decision processes , 2005 .
[15] Craig Boutilier,et al. Who's asking for help?: a Bayesian approach to intelligent assistance , 2006, IUI '06.
[16] Eric A. Hansen,et al. Synthesis of Hierarchical Finite-State Controllers for POMDPs , 2003, ICAPS.
[17] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..
[18] Trey Smith,et al. Probabilistic planning for robotic exploration , 2007 .
[19] Blai Bonet,et al. Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming , 2003, ICAPS.
[20] Eric A. Hansen,et al. Solving POMDPs by Searching in Policy Space , 1998, UAI.
[21] Weihong Zhang,et al. Restricted Value Iteration: Theory and Algorithms , 2002, J. Artif. Intell. Res..
[22] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[23] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[24] Reid G. Simmons,et al. Focused Real-Time Dynamic Programming for MDPs: Squeezing More Out of a Heuristic , 2006, AAAI.
[25] Pascal Poupart,et al. Point-Based Value Iteration for Continuous POMDPs , 2006, J. Mach. Learn. Res..
[26] R. Bellman. A Problem in the Sequential Design of Experiments , 1954 .
[27] Joelle Pineau,et al. Tractable planning under uncertainty: exploiting structure , 2004 .
[28] Reid G. Simmons,et al. Heuristic Search Value Iteration for POMDPs , 2004, UAI.
[29] D. J. White,et al. A Survey of Applications of Markov Decision Processes , 1993 .
[30] Oliver Alfred Gross,et al. On the Optimal Inventory Equation , 1955 .
[31] A. Cassandra,et al. Exact and approximate algorithms for partially observable Markov decision processes , 1998 .
[32] Shobha Venkataraman,et al. Efficient Solution Algorithms for Factored MDPs , 2003, J. Artif. Intell. Res..
[33] Dimitri P. Bertsekas,et al. Stochastic optimal control : the discrete time case , 2007 .
[34] Michael L. Littman,et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes , 1997, UAI.
[35] Leslie Pack Kaelbling,et al. Approximate Planning in POMDPs with Macro-Actions , 2003, NIPS.
[36] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.
[37] D. Bertsekas,et al. Approximate solution methods for partially observable Markov and semi-Markov decision processes , 2006 .
[38] Nikos A. Vlassis,et al. Perseus: Randomized Point-based Value Iteration for POMDPs , 2005, J. Artif. Intell. Res..
[39] Shlomo Zilberstein,et al. Solving POMDPs using quadratically constrained linear programs , 2006, AAMAS '06.
[40] Eric A. Hansen,et al. An Improved Policy Iteration Algorithm for Partially Observable MDPs , 1997, NIPS.
[41] Kee-Eung Kim,et al. Solving POMDPs by Searching the Space of Finite Policies , 1999, UAI.
[42] Kevin P. Murphy,et al. A Survey of POMDP Solution Techniques , 2000 .
[43] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[44] Reid G. Simmons,et al. Point-Based POMDP Algorithms: Improved Analysis and Implementation , 2005, UAI.
[45] Claudia V. Goldman,et al. Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis , 2004, J. Artif. Intell. Res..
[46] Craig Boutilier,et al. Bounded Finite State Controllers , 2003, NIPS.
[47] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[48] Geoffrey J. Gordon,et al. Finding Approximate POMDP Solutions Through Belief Compression , 2005, J. Artif. Intell. Res..
[49] Geoffrey J. Gordon,et al. Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees , 2005, ICML.
[50] Hideaki Itoh,et al. Partially observable Markov decision processes with imprecise parameters , 2007, Artif. Intell..
[51] Milos Hauskrecht,et al. Planning and control in stochastic domains with imperfect information , 1997 .
[52] George E. Monahan,et al. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 1982 .
[53] Chelsea C. White,et al. Markov Decision Processes with Imprecise Transition Probabilities , 1994, Oper. Res..
[54] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[55] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[56] Chelsea C. White,et al. Procedures for the Solution of a Finite-Horizon, Partially Observed, Semi-Markov Optimization Problem , 1976, Oper. Res..
[57] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[58] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[59] Makoto Yokoo,et al. Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings , 2003, IJCAI.
[60] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[61] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[62] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.