Daniel Kroening | Mohammadhosein Hasanbeig | Alessandro Abate
[1] Nir Piterman. From Nondeterministic Büchi and Streett Automata to Deterministic Parity Automata, 2007, Log. Methods Comput. Sci.
[2] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[3] Ufuk Topcu, et al. Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints, 2014, Robotics: Science and Systems.
[4] E. Feinberg, et al. An Inequality for Variances of the Discounted Rewards, 2009, Journal of Applied Probability.
[5] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[6] Rajeev Alur, et al. Deterministic generators and games for LTL fragments, 2001, Proceedings 16th Annual IEEE Symposium on Logic in Computer Science.
[7] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[8] Marco Wiering, et al. Reinforcement Learning and Markov Decision Processes, 2012, Reinforcement Learning.
[9] Orna Kupferman, et al. Model Checking of Safety Properties, 1999, Formal Methods Syst. Des.
[10] Andrea Lockerd Thomaz, et al. Teachable robots: Understanding human teaching behavior to build more effective robot learners, 2008, Artif. Intell.
[11] Wei Ren, et al. Game theory control solution for sensor coverage problem in unknown environment, 2014, 53rd IEEE Conference on Decision and Control.
[12] Joost-Pieter Katoen, et al. Quantitative model-checking of controlled discrete-time Markov processes, 2014, Inf. Comput.
[13] Jan Kretínský, et al. MoChiBA: Probabilistic LTL Model Checking Using Limit-Deterministic Büchi Automata, 2016, ATVA.
[14] Calin Belta, et al. Reinforcement learning with temporal logic rewards, 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[15] Jan Kretínský, et al. Limit-Deterministic Büchi Automata for Linear Temporal Logic, 2016, CAV.
[16] Calin Belta, et al. Optimal path planning for surveillance with temporal-logic constraints, 2011, Int. J. Robotics Res.
[17] Marta Z. Kwiatkowska, et al. PRISM 4.0: Verification of Probabilistic Real-Time Systems, 2011, CAV.
[18] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[19] Ufuk Topcu, et al. Robust control of uncertain Markov Decision Processes with temporal logic specifications, 2012, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[20] Christel Baier, et al. Principles of model checking, 2008.
[21] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[22] John N. Tsitsiklis, et al. Neuro-dynamic programming: an overview, 1995, Proceedings of 1995 34th IEEE Conference on Decision and Control.
[23] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[24] Mohammadhosein Hasanbeig, et al. On Synchronous Binary Log-Linear Learning and Second Order Q-learning, 2017.
[25] S. Safra, et al. On the complexity of ω-automata, 1988, [Proceedings 1988] 29th Annual Symposium on Foundations of Computer Science.
[26] Calin Belta, et al. Optimal control of MDPs with temporal logic constraints, 2013, 52nd IEEE Conference on Decision and Control.
[27] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[28] S. Shankar Sastry, et al. A learning based approach to control synthesis of Markov decision processes for linear temporal logic specifications, 2014, 53rd IEEE Conference on Decision and Control.
[29] Sebastian Junges, et al. Safety-Constrained Reinforcement Learning for MDPs, 2015, TACAS.
[30] Ufuk Topcu, et al. Safe Reinforcement Learning via Shielding, 2017, AAAI.
[31] Krishnendu Chatterjee, et al. Value Iteration for Long-Run Average Reward in Markov Decision Processes, 2017, CAV.
[32] R. Durrett. Essentials of Stochastic Processes, 1999.
[33] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[34] Xu Chu Ding, et al. Strategic planning under uncertainties via constrained Markov Decision Processes, 2013, 2013 IEEE International Conference on Robotics and Automation.
[35] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[36] Ufuk Topcu, et al. Correct-by-synthesis reinforcement learning with temporal logic constraints, 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[37] Marta Z. Kwiatkowska, et al. Pareto Curves for Probabilistic Model Checking, 2012, ATVA.
[38] Calin Belta, et al. Motion planning and control from temporal logic specifications with probabilistic satisfaction guarantees, 2010, 2010 IEEE International Conference on Robotics and Automation.
[39] Mohammadhosein Hasanbeig, et al. Multi-agent Learning in Coverage Control Games, 2016.
[40] Amir Pnueli, et al. The temporal logic of programs, 1977, 18th Annual Symposium on Foundations of Computer Science (sfcs 1977).
[41] Patrick Doherty, et al. Model-Based Reinforcement Learning in Continuous Environments Using Real-Time Constrained Optimization, 2015, AAAI.