Carles Gelada | Saurabh Kumar | Jacob Buckman | Ofir Nachum | Marc G. Bellemare
[1] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[2] Michael I. Jordan, et al. Reinforcement Learning with Soft State Aggregation, 1994, NIPS.
[3] A. Müller. Integral Probability Metrics and Their Generating Classes of Functions, 1997, Advances in Applied Probability.
[4] Adrian S. Lewis, et al. Convex Analysis and Nonlinear Optimization, 2000.
[5] Robert Givan, et al. Equivalence notions and model minimization in Markov decision processes, 2003, Artif. Intell.
[6] Maria L. Rizzo, et al. Testing for Equal Distributions in High Dimension, 2004.
[7] Doina Precup, et al. Metrics for Finite Markov Decision Processes, 2004, AAAI.
[8] Karl Hinderer, et al. Lipschitz Continuity of Value Functions in Markovian Decision Processes, 2005, Math. Methods Oper. Res.
[9] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[10] Sridhar Mahadevan, et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes, 2007, J. Mach. Learn. Res.
[11] Lihong Li, et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning, 2008, ICML.
[12] C. Villani. Optimal Transport: Old and New, 2008.
[13] Doina Precup, et al. Using Bisimulation for Policy Transfer in MDPs, 2010, AAAI.
[14] Doina Precup, et al. Basis Function Discovery Using Spectral Clustering and Bisimulation Metrics, 2011, AAAI.
[15] Doina Precup, et al. Bisimulation Metrics for Continuous Markov Decision Processes, 2011, SIAM J. Comput.
[16] Bernhard Schölkopf, et al. A Kernel Two-Sample Test, 2012, J. Mach. Learn. Res.
[17] Kenji Fukumizu, et al. Equivalence of distance-based and RKHS-based statistics in hypothesis testing, 2012, ArXiv.
[18] Nan Jiang, et al. Abstraction Selection in Model-based Reinforcement Learning, 2015, ICML.
[19] Luca Bascetta, et al. Policy gradient in Lipschitz Markov Decision Processes, 2015, Machine Learning.
[20] Doina Precup, et al. Representation Discovery for MDPs Using Bisimulation Metrics, 2015, AAAI.
[21] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[22] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[23] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[24] Razvan Pascanu, et al. Interaction Networks for Learning about Objects, Relations and Physics, 2016, NIPS.
[25] Michael L. Littman, et al. Near Optimal Behavior via Approximate State Abstraction, 2016, ICML.
[26] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[27] Tom Schaul, et al. The Predictron: End-To-End Learning and Planning, 2016, ICML.
[28] Razvan Pascanu, et al. Visual Interaction Networks: Learning a Physics Simulator from Video, 2017, NIPS.
[29] Léon Bottou, et al. Wasserstein Generative Adversarial Networks, 2017, ICML.
[30] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[31] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[32] Marc G. Bellemare, et al. The Cramer Distance as a Solution to Biased Wasserstein Gradients, 2017, ArXiv.
[33] Razvan Pascanu, et al. Learning to Navigate in Complex Environments, 2016, ICLR.
[34] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[35] Tom Schaul, et al. Successor Features for Transfer in Reinforcement Learning, 2016, NIPS.
[36] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.
[37] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[38] Duy Nguyen-Tuong, et al. Probabilistic Recurrent State-Space Models, 2018, ICML.
[39] Sergey Levine, et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models, 2018, NeurIPS.
[40] Sergey Levine, et al. SOLAR: Deep Structured Latent Representations for Model-Based Reinforcement Learning, 2018, ArXiv.
[41] Marc G. Bellemare, et al. Dopamine: A Research Framework for Deep Reinforcement Learning, 2018, ArXiv.
[42] Arthur Gretton, et al. Demystifying MMD GANs, 2018, ICLR.
[43] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[44] Sergey Levine, et al. Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning, 2018, ArXiv.
[45] Kavosh Asadi, et al. Lipschitz Continuity in Model-based Reinforcement Learning, 2018, ICML.
[46] Jürgen Schmidhuber, et al. Recurrent World Models Facilitate Policy Evolution, 2018, NeurIPS.
[47] Rémi Munos, et al. Implicit Quantile Networks for Distributional Reinforcement Learning, 2018, ICML.
[48] Fabio Viola, et al. Learning and Querying Fast Generative Models for Reinforcement Learning, 2018, ArXiv.
[49] Marc G. Bellemare, et al. Distributional Reinforcement Learning with Quantile Regression, 2017, AAAI.
[50] Marc G. Bellemare, et al. A Comparative Analysis of Expected and Distributional Reinforcement Learning, 2019, AAAI.
[51] Nicolas Le Roux, et al. The Value Function Polytope in Reinforcement Learning, 2019, ICML.
[52] Nicolas Le Roux, et al. A Geometric Perspective on Optimal Representations for Reinforcement Learning, 2019, NeurIPS.
[53] Yuandong Tian, et al. Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees, 2018, ICLR.
[54] Joelle Pineau, et al. Combined Reinforcement Learning via Abstract Representations, 2018, AAAI.
[55] Marc G. Bellemare, et al. Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift, 2019, AAAI.
[56] Martha White, et al. Two-Timescale Networks for Nonlinear Value Function Approximation, 2019, ICLR.
[57] Sergey Levine, et al. SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning, 2018, ICML.
[58] Yoshua Bengio, et al. Hyperbolic Discounting and Learning over Multiple Horizons, 2019, ArXiv.
[59] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[60] Sergey Levine, et al. Model-Based Reinforcement Learning for Atari, 2019, ICLR.
[61] Bernhard Pfahringer, et al. Regularisation of neural networks by enforcing Lipschitz continuity, 2018, Machine Learning.