Dynamic preferences in multi-criteria reinforcement learning
[1] D. White. Multi-objective infinite-horizon discounted Markov decision processes, 1982.
[2] Anne Lohrli. Chapman and Hall, 1985.
[3] Anton Schwartz, et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards, 1993, ICML.
[4] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[5] Eugene A. Feinberg, et al. Constrained Markov Decision Models with Weighted Discounted Rewards, 1995, Math. Oper. Res.
[6] Prasad Tadepalli, et al. Model-Based Average Reward Reinforcement Learning, 1998, Artif. Intell.
[7] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[8] Csaba Szepesvári, et al. Multi-criteria Reinforcement Learning, 1998, ICML.
[9] Ronald Parr, et al. Flexible Decomposition Algorithms for Weakly Coupled Markov Decision Problems, 1998, UAI.
[10] E. Altman. Constrained Markov Decision Processes, 1999.
[11] Keith W. Ross, et al. Computer networking - a top-down approach featuring the internet, 2000.
[12] Peter Stone. TPOT-RL Applied to Network Routing, 2000, ICML.
[13] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[14] Daphne Koller, et al. Learning an Agent's Utility Function by Observing Behavior, 2001, ICML.
[15] Lex Weaver, et al. A Multi-Agent Policy-Gradient Approach to Network Routing, 2001, ICML.
[16] Carlos Guestrin, et al. Multiagent Planning with Factored MDPs, 2001, NIPS.
[17] Craig Boutilier, et al. A POMDP formulation of preference elicitation problems, 2002, AAAI/IAAI.
[18] Bharat K. Bhargava, et al. Study of distance vector routing protocols for mobile ad hoc networks, 2003, Proceedings of the First IEEE International Conference on Pervasive Computing and Communications (PerCom 2003).
[19] Stuart J. Russell, et al. Q-Decomposition for Reinforcement Learning Agents, 2003, ICML.
[20] Shie Mannor, et al. A Geometric Approach to Multi-Criterion Reinforcement Learning, 2004, J. Mach. Learn. Res.
[21] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[22] Sridhar Mahadevan, et al. Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results, 2005, Machine Learning.