John Salvatier | David Abel | Owain Evans | Andreas Stuhlmüller
[1] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res..
[2] Stefanie Tellex,et al. Goal-Based Action Priors , 2015, ICAPS.
[3] Jude W. Shavlik,et al. Creating Advice-Taking Reinforcement Learners , 1998, Machine Learning.
[4] Doina Precup,et al. Methods for Computing State Similarity in Markov Decision Processes , 2006, UAI.
[5] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[6] W. B. Knox. Augmenting Reinforcement Learning with Human Feedback , 2011 .
[7] Eric Wiewiora,et al. Potential-Based Shaping and Q-Value Initialization are Equivalent , 2003, J. Artif. Intell. Res..
[8] Garrison W. Cottrell,et al. Principled Methods for Advising Reinforcement Learning Agents , 2003, ICML.
[9] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[10] Peter Stone,et al. State Abstraction Discovery from Irrelevant State Variables , 2005, IJCAI.
[11] Jianfeng Gao,et al. Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear , 2016, ArXiv.
[12] Shlomo Zilberstein,et al. Reinforcement Learning for Mixed Open-loop and Closed-loop Control , 1996, NIPS.
[13] David L. Roberts,et al. A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans , 2016, AAMAS.
[14] Thomas G. Dietterich,et al. Reinforcement Learning Via Practice and Critique Advice , 2010, AAAI.
[15] Yishay Mansour,et al. Approximate Equivalence of Markov Decision Processes , 2003, COLT.
[16] Lisa A. Torrey. Help an Agent Out: Student/Teacher Learning in Sequential Decision Tasks , 2011 .
[17] Sanmit Narvekar,et al. Curriculum Learning in Reinforcement Learning , 2017, IJCAI.
[18] Vladimir Vapnik,et al. On the Theory of Learning with Privileged Information , 2010, NIPS.
[19] Zachary Chase Lipton,et al. Combating Deep Reinforcement Learning's Sisyphean Curse with Intrinsic Fear , 2016, ArXiv.
[20] Peter Stone,et al. Improving Action Selection in MDP's via Knowledge Transfer , 2005, AAAI.
[21] Pradyot V. N. Korupolu,et al. Beyond Rewards: Learning from Richer Supervision , 2011 .
[22] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[23] Shimon Whiteson,et al. Alternating Optimisation and Quadrature for Robust Control , 2016, AAAI.
[24] Saso Dzeroski,et al. Integrating Guidance into Relational Reinforcement Learning , 2004, Machine Learning.
[25] Andrea Lockerd Thomaz,et al. Policy Shaping: Integrating Human Feedback with Reinforcement Learning , 2013, NIPS.
[26] Francisco Javier García-Polo,et al. Safe reinforcement learning in high-risk tasks through policy improvement , 2011, ADPRL.
[27] Sam Devlin,et al. Dynamic potential-based reward shaping , 2012, AAMAS.
[28] Ofra Amir,et al. Interactive Teaching Strategies for Agent Training , 2016, IJCAI.
[29] Alex M. Andrew,et al. Reinforcement Learning: An Introduction , 1998 .
[30] Yusen Zhan,et al. Theoretically-Grounded Policy Advice from Multiple Teachers in Reinforcement Learning Settings with Applications to Negative Transfer , 2016, IJCAI.
[31] David L. Roberts,et al. Learning something from nothing: Leveraging implicit human feedback strategies , 2014, The 23rd IEEE International Symposium on Robot and Human Interactive Communication.
[32] Oliver Kroemer,et al. Active Reward Learning , 2014, Robotics: Science and Systems.
[33] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[34] Thomas J. Walsh,et al. Towards a Unified Theory of State Abstraction for MDPs , 2006, AI&M.
[35] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[36] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[37] Michael L. Littman,et al. Near Optimal Behavior via Approximate State Abstraction , 2016, ICML.
[38] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[39] Javier García,et al. Safe Exploration of State and Action Spaces in Reinforcement Learning , 2012, J. Artif. Intell. Res..
[40] Vladimir Vapnik,et al. A new learning paradigm: Learning using privileged information , 2009, Neural Networks.
[41] Clayton T. Morrison,et al. Blending Autonomous Exploration and Apprenticeship Learning , 2011, NIPS.
[42] Benjamin Rosman,et al. What good are actions? Accelerating learning using learned action priors , 2012, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[43] Andrea Lockerd Thomaz,et al. Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance , 2006, AAAI.
[44] Peter Stone,et al. Interactively shaping agents via human reinforcement: the TAMER framework , 2009, K-CAP '09.
[45] Steffen Udluft,et al. Safe exploration for reinforcement learning , 2008, ESANN.
[46] Shimon Whiteson,et al. Alternating Optimisation and Quadrature for Robust Reinforcement Learning , 2016, ArXiv.
[47] Robert Givan,et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes , 1997, UAI.
[48] William R. Swartout. Virtual Humans as Centaurs: Melding Real and Virtual , 2016, HCI.
[49] Ronald Ortner,et al. Adaptive Aggregation for Reinforcement Learning in Average Reward Markov Decision Processes , 2013, Annals of Operations Research.
[50] Matthew E. Taylor,et al. Teaching on a budget: agents advising agents in reinforcement learning , 2013, AAMAS.
[51] Pieter Abbeel,et al. Safe Exploration in Markov Decision Processes , 2012, ICML.
[52] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[53] Marlos C. Machado,et al. Domain-Independent Optimistic Initialization for Reinforcement Learning , 2014, AAAI Workshop: Learning for General Competency in Video Games.