Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees
暂无分享,去创建一个
[1] Leslie Pack Kaelbling,et al. Planning under Time Constraints in Stochastic Domains , 1993, Artif. Intell..
[2] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[3] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[4] Shlomo Zilberstein,et al. LAO*: A heuristic search algorithm that finds solutions with loops , 2001, Artif. Intell..
[5] Blai Bonet,et al. Faster Heuristic Search Algorithms for Planning with Uncertainty and Full Feedback , 2003, IJCAI.
[6] Blai Bonet,et al. Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming , 2003, ICAPS.
[7] Anthony Stentz,et al. Focused Dynamic Programming: Extensive Comparative Results , 2004 .
[8] Reid G. Simmons,et al. Heuristic Search Value Iteration for POMDPs , 2004, UAI.
[9] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[10] Geoffrey J. Gordon,et al. Finding Approximate POMDP solutions Through Belief Compression , 2011, J. Artif. Intell. Res..
[11] Geoffrey J. Gordon,et al. Fast Exact Planning in Markov Decision Processes , 2005, ICAPS.