Q-Managed: A new algorithm for a multiobjective reinforcement learning

Abstract Multi-objective reinforcement learning applies reinforcement learning techniques to problems with multiple, often conflicting, objectives. To address such problems, we use a hybrid multi-objective optimization method that comes with a mathematical guarantee of finding every policy belonging to the Pareto front. This hybridization gives rise to Q-Managed, which combines the ϵ-constraint method with the Q-Learning algorithm: the former dynamically restricts the environment based on the agent's learning, so that once a region no longer yields improvement it is turned into a constraint, preventing the agent from returning to it. The algorithm's simplicity and performance stem from its single-policy design.
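A minimal sketch of the mechanism described above, assuming a tabular setting: plain Q-Learning plus a "manager" that, once a region stops yielding improvement, converts it into an ϵ-constraint and removes it from the agent's reachable set. The class name QManagedSketch, the env.peek(state, action) lookahead, and the patience-based improvement test are illustrative assumptions for exposition, not the authors' implementation.

import random
from collections import defaultdict

class QManagedSketch:
    def __init__(self, actions, alpha=0.1, gamma=0.95, eps=0.1, patience=50):
        self.Q = defaultdict(float)          # Q[(state, action)] value table
        self.actions = actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.blocked = set()                 # regions converted into constraints
        self.best = {}                       # best return seen per region
        self.stale = defaultdict(int)        # episodes without improvement
        self.patience = patience             # improvement-test threshold (assumed)

    def allowed(self, state, env):
        # The constraint step: actions leading into a blocked region are
        # removed from the agent's choice set. env.peek is a hypothetical
        # one-step lookahead returning the successor state.
        return [a for a in self.actions
                if env.peek(state, a) not in self.blocked]

    def act(self, state, env):
        # Epsilon-greedy over the constrained action set; fall back to all
        # actions if every successor is blocked.
        acts = self.allowed(state, env) or self.actions
        if random.random() < self.eps:
            return random.choice(acts)
        return max(acts, key=lambda a: self.Q[(state, a)])

    def update(self, s, a, r, s2, env):
        # Standard Q-Learning update, restricted to unblocked successors.
        target = r + self.gamma * max(
            (self.Q[(s2, b)] for b in self.allowed(s2, env)), default=0.0)
        self.Q[(s, a)] += self.alpha * (target - self.Q[(s, a)])

    def manage(self, region, episode_return):
        # Block a region once it has stopped improving for `patience` episodes;
        # from then on it acts as a constraint on the environment.
        if episode_return > self.best.get(region, float("-inf")):
            self.best[region] = episode_return
            self.stale[region] = 0
        else:
            self.stale[region] += 1
            if self.stale[region] >= self.patience:
                self.blocked.add(region)

Constraining the action set, rather than penalizing the reward, mirrors how the ϵ-constraint method turns exhausted objectives into hard constraints, which is what lets a single-policy learner sweep out the Pareto front one region at a time.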
