Interactive Value Iteration for Markov Decision Processes with Unknown Rewards
[1] Craig Boutilier, et al. Regret-based Reward Elicitation for Markov Decision Processes, 2009, UAI.
[2] Jaap Van Brakel, et al. Foundations of measurement, 1983.
[3] Moshe Shaked, et al. Stochastic orders and their applications, 1994.
[4] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[5] Craig Boutilier, et al. Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation, 2002, UAI.
[6] Craig Boutilier, et al. Online feature elicitation in interactive optimization, 2009, ICML '09.
[7] Craig Boutilier, et al. Robust Online Optimization of Reward-Uncertain MDPs, 2011, IJCAI.
[8] Shie Mannor, et al. Parametric regret in uncertain Markov decision processes, 2009, Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.
[9] A. Tversky, et al. Foundations of Measurement, Vol. I: Additive and Polynomial Representations, 1991.
[10] Eyke Hüllermeier, et al. Preference-based reinforcement learning: a formal framework and a policy iteration algorithm, 2012, Mach. Learn.
[11] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[12] Craig Boutilier, et al. Recommendation Sets and Choice Queries: There Is No Exploration/Exploitation Tradeoff!, 2011, AAAI.
[13] Paul Weng, et al. Markov Decision Processes with Ordinal Rewards: Reference Point-Based Preferences, 2011, ICAPS.
[14] Jesse Hoey, et al. A planning system based on Markov decision processes to guide people with dementia through activities of daily living, 2006, IEEE Transactions on Information Technology in Biomedicine.
[15] D. White. Multi-objective infinite-horizon discounted Markov decision processes, 1982.
[16] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[17] Craig Boutilier, et al. Eliciting Additive Reward Functions for Markov Decision Processes, 2011, IJCAI.
[18] Craig Boutilier, et al. Robust Policy Computation in Reward-Uncertain MDPs Using Nondominated Policies, 2010, AAAI.
[19] Patrick Suppes, et al. Foundations of measurement, 1971.
[20] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[21] Leslie Pack Kaelbling, et al. On the Complexity of Solving Markov Decision Problems, 1995, UAI.