Planning With Uncertain Specifications (PUnS)
Shen Li | Julie Shah | Ankit Shah
[1] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[2] Sheila A. McIlraith, et al. Teaching Multiple Tasks to an RL Agent using LTL, 2018, AAMAS.
[3] Alberto Camacho, et al. Finite LTL Synthesis as Planning, 2018, ICAPS.
[4] Alberto Camacho, et al. Strong Fully Observable Non-Deterministic Planning with LTL and LTLf Goals, 2019, IJCAI.
[5] Jorge A. Baier, et al. A Heuristic Search Approach to Planning with Temporally Extended Preferences, 2007, IJCAI.
[6] Joseph Kim, et al. Collaborative Planning with Encoding of Users' High-Level Strategies, 2017, AAAI.
[7] Craig Boutilier, et al. Robust Policy Computation in Reward-Uncertain MDPs Using Nondominated Policies, 2010, AAAI.
[8] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[9] Scott Sanner, et al. Non-Markovian Rewards Expressed in LTL: Guiding Search Via Reward Shaping, 2021, SOCS.
[10] Kenneth O. Stanley, et al. Go-Explore: A New Approach for Hard-Exploration Problems, 2019, arXiv.
[11] Patrik Haslum, et al. Deterministic planning in the fifth international planning competition: PDDL3 and experimental evaluation of the planners, 2009, Artif. Intell.
[12] Shen Li, et al. Bayesian Inference of Temporal Task Specifications from Demonstrations, 2018, NeurIPS.
[13] Sheila A. McIlraith, et al. Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning, 2018, ICML.
[14] Calin Belta, et al. Q-Learning for robust satisfaction of signal temporal logic specifications, 2016, IEEE 55th Conference on Decision and Control (CDC).
[15] Orna Kupferman, et al. Model Checking of Safety Properties, 1999, Formal Methods Syst. Des.
[16] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[17] Fred Kröger, et al. Temporal Logic of Programs, 1987, EATCS Monographs on Theoretical Computer Science.
[18] Fahiem Bacchus, et al. Using temporal logics to express search control knowledge for planning, 2000, Artif. Intell.
[19] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[20] Nick Hawes, et al. Optimal Policy Generation for Partially Satisfiable Co-Safe LTL Specifications, 2015, IJCAI.
[21] Moshe Y. Vardi. An Automata-Theoretic Approach to Linear Temporal Logic, 1996, Banff Higher Order Workshop.
[22] Anca D. Dragan, et al. Simplifying Reward Design through Divide-and-Conquer, 2018, Robotics: Science and Systems.
[23] Demis Hassabis, et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, 2018, Science.
[24] Anca D. Dragan, et al. Active Preference-Based Learning of Reward Functions, 2017, Robotics: Science and Systems.
[25] Hadas Kress-Gazit, et al. Temporal-Logic-Based Reactive Mission and Motion Planning, 2009, IEEE Transactions on Robotics.
[26] Richard L. Lewis, et al. Where Do Rewards Come From?, 2009.
[27] Alberto Camacho, et al. LTL and Beyond: Formal Languages for Reward Function Specification in Reinforcement Learning, 2019, IJCAI.
[28] Christian Muise, et al. Bayesian Inference of Linear Temporal Logic Specifications for Contrastive Explanations, 2019, IJCAI.
[29] Pierre Wolper, et al. Simple on-the-fly automatic verification of linear temporal logic, 1995, PSTV.
[30] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[31] Anca D. Dragan, et al. Inverse Reward Design, 2017, NIPS.
[32] Demis Hassabis, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, 2017, arXiv.
[33] Derek Long, et al. Plan Constraints and Preferences in PDDL3, 2006.
[34] Stefanie Tellex, et al. Planning with State Abstractions for Non-Markovian Task Specifications, 2019, Robotics: Science and Systems.
[35] Ufuk Topcu, et al. Environment-Independent Task Specifications via GLTL, 2017, arXiv.
[36] Craig Boutilier, et al. Rewarding Behaviors, 1996, AAAI/IAAI, Vol. 2.