Evolving Robust Policy Coverage Sets in Multi-Objective Markov Decision Processes Through Intrinsically Motivated Self-Play
Sherif M. Abdelfattah | Kathryn E. Kasmarik | Jiankun Hu
[1] Jürgen Schmidhuber,et al. A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .
[2] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[3] Patrice Perny,et al. On Minimizing Ordered Weighted Regrets in Multiobjective Markov Decision Processes , 2011, ADT.
[4] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[5] Shimon Whiteson,et al. Linear support for multi-objective coordination graphs , 2014, AAMAS.
[6] Shimon Whiteson,et al. A Survey of Multi-Objective Sequential Decision-Making , 2013, J. Artif. Intell. Res.
[7] Kathryn E. Merrick,et al. Motivated Reinforcement Learning - Curious Characters for Multiuser Games , 2009 .
[8] Richard L. Lewis,et al. Where Do Rewards Come From , 2009 .
[9] Peter Geibel,et al. Reinforcement Learning for MDPs with Constraints , 2006, ECML.
[10] Eyke Hüllermeier,et al. Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm , 2014, Machine Learning.
[11] Marcello Restelli,et al. A multiobjective reinforcement learning approach to water resources systems operation: Pareto frontier approximation in a single run , 2013 .
[12] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[13] Jürgen Schmidhuber,et al. Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[14] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[15] Ann Nowé,et al. Scalarized multi-objective reinforcement learning: Novel design techniques , 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[16] Andrew G. Barto,et al. Intrinsic Motivation and Reinforcement Learning , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.
[17] S. M. Arnsten. Intrinsic motivation. , 1990, The American journal of occupational therapy : official publication of the American Occupational Therapy Association.
[18] Patrice Perny,et al. On Finding Compromise Solutions in Multiobjective Markov Decision Processes , 2010, ECAI.
[19] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res.
[20] Pierre-Yves Oudeyer,et al. Intrinsic Motivation Systems for Autonomous Mental Development , 2007, IEEE Transactions on Evolutionary Computation.
[21] Ming Li,et al. An Introduction to Kolmogorov Complexity and Its Applications , 2019, Texts in Computer Science.
[22] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput.
[23] Konkoly Thege. Multi-criteria Reinforcement Learning , 1998 .
[24] Shimon Whiteson,et al. Point-Based Planning for Multi-Objective POMDPs , 2015, IJCAI.
[25] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[26] Yuichiro Yoshikawa,et al. Intrinsically motivated reinforcement learning for human-robot interaction in the real-world , 2018, Neural Networks.
[27] Susan A. Murphy,et al. Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis , 2010, ICML.
[28] Eyke Hüllermeier,et al. Preference-based reinforcement learning: a formal framework and a policy iteration algorithm , 2012, Mach. Learn.
[29] Paul M. B. Vitányi,et al. An Introduction to Kolmogorov Complexity and Its Applications, Third Edition , 1997, Texts in Computer Science.
[30] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[31] Martin Ester,et al. Density-based clustering , 2019, WIREs Data Mining Knowl. Discov.
[32] Pierre-Yves Oudeyer,et al. What is Intrinsic Motivation? A Typology of Computational Approaches , 2007, Frontiers Neurorobotics.
[33] E. Deci,et al. Intrinsic and Extrinsic Motivations: Classic Definitions and New Directions. , 2000, Contemporary educational psychology.
[34] Eugene A. Feinberg,et al. Constrained Markov Decision Models with Weighted Discounted Rewards , 1995, Math. Oper. Res.
[35] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[36] Jürgen Schmidhuber,et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.
[37] A. Shamsai,et al. Multi-objective Optimization , 2017, Encyclopedia of Machine Learning and Data Mining.
[38] Nicola Beume,et al. On the Complexity of Computing the Hypervolume Indicator , 2009, IEEE Transactions on Evolutionary Computation.
[39] Lotfi A. Zadeh,et al. Fuzzy logic = computing with words , 1996, IEEE Trans. Fuzzy Syst.
[40] Michèle Sebag,et al. Preference-Based Policy Learning , 2011, ECML/PKDD.
[41] Evan Dekker,et al. Empirical evaluation methods for multiobjective reinforcement learning algorithms , 2011, Machine Learning.