Ufuk Topcu | Jie Fu | Min Wen | Michael L. Littman | James MacGlashan | Charles Lee Isbell
[1] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 1998, Machine Learning.
[2] Chris Watkins. Learning from Delayed Rewards, 1989.
[3] Ufuk Topcu, et al. Synthesis of Reactive Switching Protocols from Temporal Logic Specifications, 2013, IEEE Transactions on Automatic Control.
[4] Zohar Manna, et al. Formal Verification of Probabilistic Systems, 1997.
[5] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[6] Ronen I. Brafman, et al. R-MAX: A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, Journal of Machine Learning Research.
[7] Ufuk Topcu, et al. Correct, Reactive Robot Control from Abstraction and Temporal Logic Specifications, 2011.
[8] Calin Belta, et al. LTL Control in Uncertain Environments with Probabilistic Satisfaction Guarantees, 2011, arXiv.
[9] Ufuk Topcu, et al. Receding Horizon Temporal Logic Planning, 2012, IEEE Transactions on Automatic Control.
[10] Orna Kupferman, et al. Discounting in LTL, 2014, TACAS.
[11] Christel Baier, et al. Principles of Model Checking, 2008.
[12] Thomas A. Henzinger, et al. Discounting the Future in Systems Theory, 2003, ICALP.
[13] Thomas J. Walsh, et al. Knows What It Knows: A Framework for Self-Aware Learning, 2008, ICML.
[14] Hadas Kress-Gazit, et al. Temporal-Logic-Based Reactive Mission and Motion Planning, 2009, IEEE Transactions on Robotics.
[15] Claude-Nicolas Fiechter, et al. Efficient Reinforcement Learning, 1994, COLT.
[16] Lihong Li, et al. Reinforcement Learning in Finite MDPs: PAC Analysis, 2009, Journal of Machine Learning Research.
[17] Craig Boutilier, et al. Rewarding Behaviors, 1996, AAAI/IAAI.
[18] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, Journal of Artificial Intelligence Research.
[19] Thomas A. Henzinger, et al. Model Checking Discounted Temporal Properties, 2005, Theoretical Computer Science.
[20] Pieter Abbeel, et al. Exploration and Apprenticeship Learning in Reinforcement Learning, 2005, ICML.
[21] Calin Belta, et al. Control of Markov Decision Processes from PCTL Specifications, 2011, Proceedings of the American Control Conference.
[22] Ufuk Topcu, et al. Robust Control of Uncertain Markov Decision Processes with Temporal Logic Specifications, 2012, 51st IEEE Conference on Decision and Control (CDC).
[23] Zohar Manna, et al. The Temporal Logic of Reactive and Concurrent Systems, 1991, Springer.
[24] Richard S. Sutton, et al. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[25] Christopher G. Atkeson, et al. Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming, 1993, NIPS.
[26] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, Journal of Artificial Intelligence Research.
[27] Andrew W. Moore, et al. Variable Resolution Dynamic Programming, 1991, ML Workshop.
[28] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[29] Marta Z. Kwiatkowska, et al. PRISM: Probabilistic Symbolic Model Checker, 2002, Computer Performance Evaluation / TOOLS.