RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning
[1] Shie Mannor, et al. Lightning Does Not Strike Twice: Robust MDPs with Coupled Uncertainty, 2012, ICML.
[2] Marek Petrik, et al. Approximate Dynamic Programming By Minimizing Distributionally Robust Bounds, 2012, ICML.
[3] Andrey Bernstein, et al. Adaptive Aggregation for Reinforcement Learning with Efficient Exploration: Deterministic Domains, 2008, COLT.
[4] J. Tsitsiklis, et al. Robust, Risk-Sensitive, and Data-Driven Control of Markov Decision Processes, 2007.
[5] D. Bertsekas, et al. Adaptive Aggregation Methods for Infinite Horizon Dynamic Programming, 1989.
[6] Rémi Munos, et al. Performance Bounds in Lp-norm for Approximate Value Iteration, 2007, SIAM J. Control. Optim.
[7] Vivek F. Farias, et al. Approximate Dynamic Programming via a Smoothed Linear Program, 2009, Oper. Res.
[8] Robert L. Smith, et al. Aggregation in Dynamic Programming, 1987, Oper. Res.
[9] John N. Tsitsiklis, et al. Analysis of Temporal-Difference Learning with Function Approximation, 1996, NIPS.
[10] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[11] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[12] Marek Petrik, et al. Solution Methods for Constrained Markov Decision Process with Continuous Probability Modulation, 2013, UAI.
[13] Benjamin Van Roy, et al. The Linear Programming Approach to Approximate Dynamic Programming, 2003, Oper. Res.
[14] J. Filar, et al. Competitive Markov Decision Processes, 1996.
[15] Marek Petrik, et al. Constraint Relaxation in Approximate Linear Programs, 2009, ICML.
[16] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[17] Benjamin Van Roy. Performance Loss Bounds for Approximate Value Iteration with State Aggregation, 2006, Math. Oper. Res.
[18] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.
[19] Evan L. Porteus. Foundations of Stochastic Inventory Theory, 2002.
[20] Daniel Kuhn, et al. Robust Markov Decision Processes, 2013, Math. Oper. Res.
[21] Peter Bro Miltersen, et al. Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor, 2010, JACM.
[22] Garud Iyengar, et al. Robust Dynamic Programming, 2005, Math. Oper. Res.
[23] Andrew J. Schaefer, et al. Robust Modified Policy Iteration, 2013, INFORMS J. Comput.
[24] Marek Petrik, et al. Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes, 2010, ICML.