Trial-Based Heuristic Tree Search for MDPs with Factored Action Spaces
[1] Mario A. Nascimento, et al. Action Abstractions for Combinatorial Multi-Armed Bandit Tree Search, 2018, AIIDE.
[2] Alan Fern, et al. Dynamic Resource Allocation for Optimizing Population Diffusion, 2014, AISTATS.
[3] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[4] Thomas Keller, et al. Anytime optimal MDP planning with trial-based heuristic tree search, 2015.
[5] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[6] H. Jaap van den Herik, et al. Progressive Strategies for Monte-Carlo Tree Search, 2008.
[7] Florian Geißer. An Analysis of the Probabilistic Track of the IPC 2018, 2019.
[8] Florian Geißer, et al. PROST-DD-Utilizing Symbolic Classical Planning in THTS, 2018.
[9] Pablo H. Ibargüengoytia, et al. Open Questions for Building Optimal Operation Policies for Dam Management Using Factored Markov Decision Processes, 2015, AAAI Fall Symposia.
[10] Malte Helmert, et al. Concise finite-domain representations for PDDL planning tasks, 2009, Artif. Intell.
[11] Daphne Koller, et al. Computing Factored Value Functions for Policies in Structured MDPs, 1999, IJCAI.
[12] Thomas Keller, et al. PROST: Probabilistic Planning Based on UCT, 2012, ICAPS.
[13] Sean R Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[14] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[15] Olivier Regnier-Coudert, et al. Evolutionary approaches to dynamic earth observation satellites mission planning under uncertainty, 2019, GECCO.
[16] Santiago Ontañón, et al. The Combinatorial Multi-Armed Bandit Problem and Its Application to Real-Time Strategy Games, 2013, AIIDE.
[17] Rémi Munos, et al. Pure Exploration in Multi-armed Bandits Problems, 2009, ALT.
[18] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[19] Jendrik Seipp, et al. From Non-Negative to General Operator Cost Partitioning, 2015, AAAI.
[20] Shobha Venkataraman, et al. Efficient Solution Algorithms for Factored MDPs, 2003, J. Artif. Intell. Res.
[21] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[22] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[23] Malte Helmert, et al. Trial-Based Heuristic Tree Search for Finite Horizon MDPs, 2013, ICAPS.
[24] Nils J. Nilsson, et al. Principles of Artificial Intelligence, 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.