Teaching Multiple Tasks to an RL Agent using LTL
Sheila A. McIlraith | Richard Anthony Valenzano | Toryn Q. Klassen | Rodrigo Toro Icarte
[1] Gregory Kuhlmann, Peter Stone, Raymond J. Mooney, and Jude W. Shavlik. Guiding a Reinforcement Learner with Natural Language Advice: Initial Results in RoboCup Soccer , 2004, AAAI 2004.
[2] Craig Boutilier,et al. Rewarding Behaviors , 1996, AAAI/IAAI, Vol. 2.
[3] Amir Pnueli,et al. The temporal logic of programs , 1977, 18th Annual Symposium on Foundations of Computer Science (sfcs 1977).
[4] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[5] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[6] Satinder P. Singh,et al. Reinforcement Learning with a Hierarchy of Abstract Models , 1992, AAAI.
[7] Alessandro Lazaric,et al. Transfer from Multiple MDPs , 2011, NIPS.
[8] John K. Slaney,et al. Decision-Theoretic Planning with non-Markovian Rewards , 2011, J. Artif. Intell. Res..
[9] Daniel Kroening,et al. Logically-Constrained Reinforcement Learning , 2018, arXiv:1801.08099.
[10] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..
[11] P. Stone,et al. TAMER: Training an Agent Manually via Evaluative Reinforcement , 2008, 2008 7th IEEE International Conference on Development and Learning.
[12] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[13] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[14] Alan Fern,et al. Multi-task reinforcement learning: a hierarchical Bayesian approach , 2007, ICML '07.
[15] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[16] Karen M. Feigh,et al. Learning From Explanations Using Sentiment and Advice in RL , 2017, IEEE Transactions on Cognitive and Developmental Systems.
[17] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[18] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[19] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[20] Ufuk Topcu,et al. Environment-Independent Task Specifications via GLTL , 2017, ArXiv.
[21] Dan Klein,et al. Modular Multitask Reinforcement Learning with Policy Sketches , 2016, ICML.
[22] Fahiem Bacchus,et al. Using temporal logics to express search control knowledge for planning , 2000, Artif. Intell..
[23] Nick Hawes,et al. Optimal Policy Generation for Partially Satisfiable Co-Safe LTL Specifications , 2015, IJCAI.
[24] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[25] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[26] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[27] Ronen I. Brafman,et al. LTLf/LDLf Non-Markovian Rewards , 2018, AAAI.
[28] Guan Wang,et al. Interactive Learning from Policy-Dependent Human Feedback , 2017, ICML.
[29] Daniel Kroening,et al. Logically-Correct Reinforcement Learning , 2018, ArXiv.
[30] Matthew E. Taylor,et al. Integrating Human Demonstration and Reinforcement Learning : Initial Results in Human-Agent Transfer , 2010 .
[31] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[32] Marcus Hutter,et al. Multi-task reinforcement learning : shaping and feature selection , 2011 .
[33] David L. Roberts,et al. Training an Agent to Ground Commands with Reward and Punishment , 2014, AAAI 2014.
[34] Yee Whye Teh,et al. Distral: Robust multitask reinforcement learning , 2017, NIPS.
[35] Calin Belta,et al. Reinforcement learning with temporal logic rewards , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[36] Nick Hawes,et al. Optimal and dynamic planning for Markov decision processes with co-safe LTL specifications , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[37] Tanaka Fumihide,et al. Multitask Reinforcement Learning on the Distribution of MDPs , 2003 .
[38] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[39] Alessandro Lazaric,et al. Bayesian Multi-Task Reinforcement Learning , 2010, ICML.
[40] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[41] Sheila A. McIlraith,et al. Advice-Based Exploration in Model-Based Reinforcement Learning , 2018, Canadian Conference on AI.
[42] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[43] Orna Kupferman,et al. Model Checking of Safety Properties , 1999, CAV.
[44] Jude W. Shavlik,et al. Creating Advice-Taking Reinforcement Learners , 1998, Machine Learning.
[45] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[46] Peter Stone,et al. Cross-domain transfer for reinforcement learning , 2007, ICML '07.
[47] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[48] Ufuk Topcu,et al. Probably Approximately Correct Learning in Stochastic Games with Temporal Logic Specifications , 2016, IJCAI.
[49] Andrea Lockerd Thomaz,et al. Reinforcement Learning with Human Teachers: Understanding How People Want to Teach Robots , 2006, ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication.
[50] Lihong Li,et al. Sample Complexity of Multi-task Reinforcement Learning , 2013, UAI.
[51] Scott Sanner,et al. Non-Markovian Rewards Expressed in LTL: Guiding Search Via Reward Shaping , 2021, SOCS.
[52] Matthias Scheutz,et al. What to do and how to do it: Translating natural language directives into temporal and dynamic logic representation for goal management and action execution , 2009, 2009 IEEE International Conference on Robotics and Automation.
[53] Ufuk Topcu,et al. Learning from Demonstrations with High-Level Side Information , 2017, IJCAI.