Metrics and continuity in reinforcement learning