Knowledge-based reward shaping with knowledge revision in reinforcement learning
[1] Mark A. Peot,et al. Conditional nonlinear planning , 1992 .
[2] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[3] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[4] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[5] Richard Fikes,et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.
[6] Jude W. Shavlik,et al. Advice Refinement in Knowledge-Based SVMs , 2011, NIPS.
[7] Jude W. Shavlik,et al. Refining Rules Incorporated into Knowledge-Based Support Vector Learners Via Successive Linear Programming , 2007, AAAI.
[8] M. Grzes,et al. Plan-based reward shaping for reinforcement learning , 2008, 2008 4th International IEEE Conference Intelligent Systems.
[9] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[10] Mong-Li Lee,et al. Distributed relational temporal difference learning , 2013, AAMAS.
[11] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[12] Amal El Fallah Seghrouchni,et al. Multi-Agent Planning , 2012, Software Agents, Agent Systems and Their Applications.
[13] Héctor Muñoz-Avila,et al. RETALIATE: Learning Winning Policies in First-Person Shooter Games , 2007, AAAI.
[14] Peter Gärdenfors,et al. Belief Revision: Belief revision: An introduction , 2003 .
[15] Gabriele Kern-Isberner,et al. Conditionals in Nonmonotonic Reasoning and Belief Revision: Considering Conditionals as Agents , 2001 .
[16] Gabriele Kern-Isberner,et al. Combining Reinforcement Learning and Belief Revision - A Learning System for Active Vision , 2008, BMVC.
[17] Kurt Driessens,et al. Relational Reinforcement Learning , 1998, Machine-mediated learning.
[18] S. Rosenschein,et al. On social laws for artificial agent societies: off-line design , 1996 .
[19] Daniel Kudenko,et al. Multigrid Reinforcement Learning with Reward Shaping , 2008, ICANN.
[20] Preben Alstrøm,et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping , 1998, ICML.
[21] Sam Devlin,et al. An Empirical Study of Potential-Based Reward Shaping and Advice in Complex, Multi-Agent Systems , 2011, Adv. Complex Syst..
[22] Héctor Muñoz-Avila,et al. CLASSQ-L: A Q-Learning Algorithm for Adversarial Real-Time Strategy Games , 2012, Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment.
[23] Julia Rose Galliers. Belief Revision: Autonomous belief revision and communication , 1992 .
[24] Ashwin Ram,et al. Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL , 2007, IJCAI.
[25] Bhaskara Marthi,et al. Automatic shaping and decomposition of reward functions , 2007, ICML '07.
[26] Garrison W. Cottrell,et al. Principled Methods for Advising Reinforcement Learning Agents , 2003, ICML.
[27] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[28] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[29] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[30] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[31] András Lörincz,et al. Learning Tetris Using the Noisy Cross-Entropy Method , 2006, Neural Computation.
[32] Michael L. Littman,et al. Potential-based Shaping in Model-based Reinforcement Learning , 2008, AAAI.
[33] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[34] Peter Gärdenfors,et al. On the logic of theory change: Partial meet contraction and revision functions , 1985, Journal of Symbolic Logic.
[35] Ming Tan,et al. Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents , 1997, ICML.
[36] Ian D. Watson,et al. Applying reinforcement learning to small scale combat in the real-time strategy game StarCraft:Broodwar , 2012, 2012 IEEE Conference on Computational Intelligence and Games (CIG).
[37] Sam Devlin,et al. Theoretical considerations of potential-based reward shaping for multi-agent systems , 2011, AAMAS.
[38] Peter Norvig,et al. Artificial intelligence - a modern approach, 2nd Edition , 2003, Prentice Hall series in artificial intelligence.
[39] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control 3rd Edition, Volume II , 2010 .
[40] J. Nash,et al. Non-Cooperative Games , 1951, Classics in Game Theory.
[41] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[42] Jeffrey S. Rosenschein,et al. Synchronization of Multi-Agent Plans , 1982, AAAI.
[43] Sam Devlin,et al. Dynamic potential-based reward shaping , 2012, AAMAS.
[44] Aaron Hunter,et al. Iterated Belief Change Due to Actions and Observations , 2011, J. Artif. Intell. Res..
[45] Sam Devlin,et al. Plan-based reward shaping for multi-agent reinforcement learning , 2016, The Knowledge Engineering Review.
[46] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.