[1] Vladimir Braverman, et al. The Benefits of Implicit Regularization from SGD in Least Squares Problems, 2021, arXiv.
[2] Erez Karpas, et al. Generalized Planning With Deep Reinforcement Learning, 2020, arXiv.
[3] Christos Dimitrakakis, et al. Bayesian Reinforcement Learning via Deep, Sparse Sampling, 2020, AISTATS.
[4] Ohad Shamir, et al. Learnability, Stability and Uniform Convergence, 2010, J. Mach. Learn. Res.
[5] Ron Meir, et al. Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory, 2017, ICML.
[6] J. W. Nieuwenhuis, et al. Book review of D.P. Bertsekas (ed.), Dynamic Programming and Optimal Control - Volume 2, 1999.
[7] Chelsea Finn, et al. Meta-Learning with Fewer Tasks through Task Interpolation, 2021, arXiv.
[8] Roy Fox, et al. Taming the Noise in Reinforcement Learning via Soft Updates, 2015, UAI.
[9] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.
[10] Shie Mannor, et al. Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs, 2020, AAAI.
[11] Andreas Krause, et al. PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees, 2020, ICML.
[12] Yoram Singer, et al. Train faster, generalize better: Stability of stochastic gradient descent, 2015, ICML.
[13] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[14] Lihong Li, et al. PAC model-free reinforcement learning, 2006, ICML.
[15] Brian F. Hutton, et al. What is the distribution of the number of unique original items in a bootstrap sample?, 2016, arXiv:1602.05822.
[16] Dimitris S. Papailiopoulos, et al. Stability and Generalization of Learning Algorithms that Converge to Global Optima, 2017, ICML.
[17] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[18] Shie Mannor, et al. Contextual Markov Decision Processes, 2015, arXiv.
[19] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[20] Yu Zhang, et al. Multi-Task Learning and Algorithmic Stability, 2015, AAAI.
[21] J. Schulman, et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning, 2019, ICML.
[22] Amir Beck, et al. First-Order Methods in Optimization, 2017.
[23] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[24] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, arXiv.
[25] Peter Dayan, et al. Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search, 2012, NIPS.
[26] Aviv Tamar, et al. Offline Meta Learning of Exploration, 2020.
[27] Shie Mannor, et al. Bayesian Reinforcement Learning: A Survey, 2015, Found. Trends Mach. Learn.
[28] Vicenç Gómez, et al. A unified view of entropy-regularized Markov decision processes, 2017, arXiv.
[29] Peter L. Bartlett, et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016, arXiv.
[30] Shimon Whiteson, et al. VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning, 2020, ICLR.
[31] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[32] André Elisseeff, et al. Stability and Generalization, 2002, J. Mach. Learn. Res.
[33] Anirudha Majumdar, et al. PAC-BUS: Meta-Learning Bounds via PAC-Bayes and Uniform Stability, 2021, arXiv.
[34] Ilya Kostrikov, et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels, 2020, arXiv.
[35] Craig Boutilier, et al. Differentiable Meta-Learning of Bandit Policies, 2020, NeurIPS.