Multi-objectivization of reinforcement learning problems by reward shaping
Matthew E. Taylor | Tim Brys | Daniel Kudenko | Peter Vrancx | Ann Nowé | Anna Harutyunyan
[1] Mikkel T. Jensen, et al. Helper-objectives: Using multi-objective evolutionary algorithms for single-objective optimisation, 2004, J. Math. Model. Algorithms.
[2] Julian Togelius, et al. The Mario AI Benchmark and Competitions, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[3] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[4] Richard S. Sutton, et al. Reinforcement learning with replacing eligibility traces, 1996, Machine Learning.
[5] Jonathan E. Fieldsend, et al. Optimizing Decision Trees Using Multi-objective Particle Swarm Optimization, 2009.
[6] John N. Tsitsiklis, et al. Asynchronous stochastic approximation and Q-learning, 1994, Mach. Learn.
[7] Ann Nowé, et al. Scalarized multi-objective reinforcement learning: Novel design techniques, 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[8] Xiaodong Li, et al. Evolutionary algorithms and multi-objectivization for the travelling salesman problem, 2009, GECCO.
[9] Csaba Szepesvári, et al. Multi-criteria Reinforcement Learning, 1998, ICML.
[10] J. Dennis, et al. A closer look at drawbacks of minimizing weighted sums of objectives for Pareto set generation in multicriteria optimization problems, 1997.
[11] Frank Neumann, et al. Do additional objectives make a problem harder?, 2007, GECCO '07.
[12] James S. Albus, et al. Brains, behavior, and robotics, 1981.
[13] Evan Dekker, et al. Empirical evaluation methods for multiobjective reinforcement learning algorithms, 2011, Machine Learning.
[14] V. G. Zhadan, et al. Exact auxiliary functions in optimization problems, 1991.
[15] Richard A. Watson, et al. Reducing Local Optima in Single-Objective Problems by Multi-objectivization, 2001, EMO.
[16] Chris Watkins, et al. Learning from delayed rewards, 1989.
[17] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[18] Sushil J. Louis, et al. Pareto Optimality, GA-easiness and Deception, 2012.
[19] Joshua D. Knowles, et al. Multiobjectivization by Decomposition of Scalar Cost Functions, 2008, PPSN.
[20] Matthew E. Taylor, et al. Adaptive objective selection for correlated objectives in multi-objective reinforcement learning, 2014, AAMAS.
[21] Marco Wiering, et al. Ensemble Algorithms in Reinforcement Learning, 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[22] Daniel Kudenko, et al. Using plan-based reward shaping to learn strategies in StarCraft: Broodwar, 2013, 2013 IEEE Conference on Computational Intelligence in Games (CIG).
[23] Kazutoshi Sakakibara, et al. Multi-objective approaches in a single-objective optimization environment, 2005, 2005 IEEE Congress on Evolutionary Computation.
[24] Sam Devlin, et al. An Empirical Study of Potential-Based Reward Shaping and Advice in Complex, Multi-Agent Systems, 2011, Adv. Complex Syst.
[25] Arina Buzdalova, et al. Generation of Tests for Programming Challenge Tasks Using Helper-Objectives, 2013, SSBSE.
[26] Carlos A. Coello Coello, et al. Swarm Intelligence for Multi-objective Problems in Data Mining, 2009.
[27] Kalyanmoy Deb, et al. Trading on infeasibility by exploiting constraint’s criticality through multi-objectivization: A system design perspective, 2007, 2007 IEEE Congress on Evolutionary Computation.
[28] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[29] Maja J. Mataric, et al. Reward Functions for Accelerated Learning, 1994, ICML.
[30] Sushil J. Louis, et al. Pareto Optimality, GA-Easiness and Deception (Extended Abstract), 1993, International Conference on Genetic Algorithms.
[31] Arina Buzdalova, et al. Increasing Efficiency of Evolutionary Algorithms by Choosing between Auxiliary Fitness Functions with Reinforcement Learning, 2012, 2012 11th International Conference on Machine Learning and Applications.
[32] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[33] A. H. Klopf, et al. Brain Function and Adaptive Systems: A Heterostatic Theory, 1972.
[34] Shimon Whiteson, et al. A Survey of Multi-Objective Sequential Decision-Making, 2013, J. Artif. Intell. Res.
[35] Michael L. Littman, et al. An Ensemble of Linearly Combined Reinforcement-Learning Agents, 2013, AAAI.