A temporal difference method for multi-objective reinforcement learning
Manuela Ruiz-Montiel | Lawrence Mandow | José-Luis Pérez-de-la-Cruz
[1] Christian R. Shelton, et al. Importance sampling for reinforcement learning with multiple objectives, 2001.
[2] Susan A. Murphy, et al. Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis, 2010, ICML.
[3] Michèle Sebag, et al. Hypervolume indicator and dominance reward based multi-objective Monte-Carlo Tree Search, 2013, Machine Learning.
[4] Visa Koivunen, et al. Reinforcement learning based sensing policy optimization for energy efficient cognitive radio networks, 2011, Neurocomputing.
[5] Yasuaki Kuroe, et al. Multi-objective reinforcement learning method for acquiring all pareto optimal policies simultaneously, 2012, 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC).
[6] Andrea Castelletti, et al. Reinforcement learning in the operational management of a water system, 2002.
[7] Shimon Whiteson, et al. A Survey of Multi-Objective Sequential Decision-Making, 2013, J. Artif. Intell. Res.
[8] Ann Nowé, et al. Scalarized multi-objective reinforcement learning: Novel design techniques, 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[9] D. White. Multi-objective infinite-horizon discounted Markov decision processes, 1982.
[10] Srini Narayanan, et al. Learning all optimal policies with multiple criteria, 2008, ICML '08.
[11] Manabu Yoshida, et al. Parallel reinforcement learning for weighted multi-criteria model with adaptive margin, 2007, Cognitive Neurodynamics.
[12] Andrei V. Kelarev, et al. Constructing Stochastic Mixture Policies for Episodic Multiobjective Reinforcement Learning Tasks, 2009, Australasian Conference on Artificial Intelligence.
[13] Sriraam Natarajan, et al. Dynamic preferences in multi-criteria reinforcement learning, 2005, ICML.
[14] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[15] Manuela Ruiz-Montiel, et al. Design with shape grammars and reinforcement learning, 2013, Adv. Eng. Informatics.
[16] Evan Dekker, et al. Empirical evaluation methods for multiobjective reinforcement learning algorithms, 2011, Machine Learning.
[17] John Yearwood, et al. On the Limitations of Scalarisation for Multi-objective Reinforcement Learning of Pareto Fronts, 2008, Australasian Conference on Artificial Intelligence.
[18] M.A. Wiering, et al. Computing Optimal Stationary Policies for Multi-Objective Markov Decision Processes, 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[19] Anders R. Kristensen, et al. Dynamic programming and Markov decision processes, 1996.
[20] Joseph A. Paradiso, et al. The gesture recognition toolkit, 2014, J. Mach. Learn. Res.
[21] Lorenzo Mandow-Andaluz, et al. PQ-learning: aprendizaje por refuerzo multiobjetivo, 2013.
[22] John S. Gero, et al. A comparison of three methods for generating the Pareto optimal set, 1984.