Automated Hierarchy Discovery for Planning in Partially Observable Domains