Sampling Based Approaches for Minimizing Regret in Uncertain Markov Decision Processes (MDPs)
Patrick Jaillet | Pradeep Varakantham | Yossiri Adulyasak | Asrar Ahmed | Meghna Lowalekar
[1] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[2] Michael H. Bowling, et al. Tractable Objectives for Robust Policy Optimization, 2012, NIPS.
[3] Patrick Jaillet, et al. Loss bounds for uncertain transition probabilities in Markov decision processes, 2012, 51st IEEE Conference on Decision and Control (CDC).
[4] Jesse Hoey, et al. An analytic solution to discrete Bayesian reinforcement learning, 2006, ICML.
[5] R. Bellman. A Markovian Decision Process, 1957.
[6] Giuseppe Carlo Calafiore, et al. Uncertain convex programs: randomized solutions and confidence levels, 2005, Math. Program.
[7] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[8] Joelle Pineau, et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.
[9] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.
[10] Patrick Jaillet, et al. Regret based Robust Solutions for Uncertain Markov Decision Processes, 2013, NIPS.
[11] David Hsu, et al. Monte Carlo Bayesian Reinforcement Learning, 2012, ICML.
[12] A. Shapiro. Monte Carlo Sampling Methods, 2003.
[13] Shie Mannor, et al. Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach, 2015, NIPS.
[14] Michael L. Littman, et al. An analysis of model-based Interval Estimation for Markov Decision Processes, 2008, J. Comput. Syst. Sci.
[15] Shie Mannor, et al. Parametric regret in uncertain Markov decision processes, 2009, Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.
[16] Daniel Kuhn, et al. Robust Markov Decision Processes, 2013, Math. Oper. Res.
[17] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[18] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.
[19] Craig Boutilier, et al. Regret-based Reward Elicitation for Markov Decision Processes, 2009, UAI.
[20] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[21] Robert Givan, et al. Bounded Parameter Markov Decision Processes, 1997, ECP.
[22] Garud Iyengar, et al. Robust Dynamic Programming, 2005, Math. Oper. Res.
[23] Craig Boutilier, et al. Robust Policy Computation in Reward-Uncertain MDPs Using Nondominated Policies, 2010, AAAI.
[24] R. Bellman, et al. Dynamic Programming and Markov Processes, 1960.
[25] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[26] Shie Mannor, et al. Percentile Optimization for Markov Decision Processes with Parameter Uncertainty, 2010, Oper. Res.
[27] Shie Mannor, et al. Lightning Does Not Strike Twice: Robust MDPs with Coupled Uncertainty, 2012, ICML.
[28] Rémi Munos, et al. Optimistic Planning in Markov Decision Processes Using a Generative Model, 2014, NIPS.
[29] J. Schreiber. Foundations of Statistics, 2016.
[30] Andrew Y. Ng, et al. Solving Uncertain Markov Decision Processes, 2001.