Spectral Normalisation for Deep Reinforcement Learning: an Optimisation Perspective
Florin Gogianu | Tudor Berariu | Mihaela Rosca | Claudia Clopath | Lucian Busoniu | Razvan Pascanu
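As a quick illustration of the technique the title refers to, spectral normalisation (reference [1]) divides a weight matrix by an estimate of its largest singular value, obtained cheaply by power iteration, so that the layer becomes approximately 1-Lipschitz. A minimal NumPy sketch — function name, iteration count, and tolerance are our own choices, not taken from the paper:

```python
import numpy as np

def spectral_normalise(W, n_iter=5, eps=1e-12):
    """Return W scaled by an estimate of its spectral norm ||W||_2.

    The estimate comes from power iteration on W^T W, as in
    spectral normalisation for GANs (Miyato et al., 2018).
    """
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])  # left singular vector estimate
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v) + eps
        u = W @ v
        u /= np.linalg.norm(u) + eps
    sigma = u @ W @ v  # approximate largest singular value of W
    return W / sigma

# Usage: a diagonal matrix with singular values {3, 1}.
W = np.array([[3.0, 0.0], [0.0, 1.0]])
W_sn = spectral_normalise(W)
```

After normalisation the largest singular value of `W_sn` is close to 1, which is the 1-Lipschitz constraint the method enforces per layer.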
[1] Yuichi Yoshida, et al. Spectral Normalization for Generative Adversarial Networks, 2018, ICLR.
[2] Dustin Tran, et al. Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness, 2020, NeurIPS.
[3] Moustapha Cissé, et al. Parseval Networks: Improving Robustness to Adversarial Examples, 2017, ICML.
[4] Tian Tian, et al. MinAtar: An Atari-Inspired Testbed for Thorough and Reproducible Reinforcement Learning Experiments, 2019.
[5] Razvan Pascanu, et al. Distilling Policy Distillation, 2019, AISTATS.
[6] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[7] Fabien Moutarde, et al. Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field, 2019.
[8] Sergey Levine, et al. Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning, 2020, ICLR.
[9] Xiaohua Zhai, et al. A Large-Scale Study on Regularization and Normalization in GANs, 2018, ICML.
[10] David Tse, et al. Generalizable Adversarial Training via Spectral Normalization, 2018, ICLR.
[11] Taehoon Kim, et al. Quantifying Generalization in Reinforcement Learning, 2018, ICML.
[12] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[13] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[14] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[15] Peter Henderson, et al. TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?, 2020, ArXiv.
[16] Frederik Kunstner, et al. Limitations of the empirical Fisher approximation for natural gradient descent, 2019, NeurIPS.
[17] Cem Anil, et al. Sorting out Lipschitz function approximation, 2018, ICML.
[18] Marc G. Bellemare, et al. Dopamine: A Research Framework for Deep Reinforcement Learning, 2018, ArXiv.
[19] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[20] Elad Hoffer, et al. Norm matters: efficient and accurate normalization schemes in deep networks, 2018, NeurIPS.
[21] Pablo Samuel Castro, et al. Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research, 2021, ICML.
[22] Joelle Pineau, et al. Interference and Generalization in Temporal Difference Learning, 2020, ICML.
[23] Masashi Sugiyama, et al. Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks, 2018, NeurIPS.
[24] Lantao Yu, et al. MOPO: Model-based Offline Policy Optimization, 2020, NeurIPS.
[25] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[26] T. Weber, et al. A case for new neural network smoothness constraints, 2020, ICBINB@NeurIPS.
[27] Twan van Laarhoven, et al. L2 Regularization versus Batch and Weight Normalization, 2017, ArXiv.
[28] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[29] Kevin Scaman, et al. Lipschitz regularity of deep neural networks: analysis and efficient estimation, 2018, NeurIPS.
[30] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.
[31] Tom Schaul, et al. Rainbow: Combining Improvements in Deep Reinforcement Learning, 2017, AAAI.
[32] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.
[33] Thomas Brox, et al. CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity, 2019, ArXiv:1902.05605.
[34] G. Golub, et al. Eigenvalue computation in the 20th century, 2000.
[35] Bernhard Pfahringer, et al. Regularisation of neural networks by enforcing Lipschitz continuity, 2018, Machine Learning.
[36] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.
[37] Razvan Pascanu, et al. Revisiting Natural Gradient for Deep Networks, 2013, ICLR.
[38] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[39] Arash Givchi, et al. Quasi Newton Temporal Difference Learning, 2014, ACML.
[40] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[41] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[42] Aleksander Madry, et al. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift), 2018, NeurIPS.
[43] Joelle Pineau, et al. Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods, 2018, ArXiv.
[44] Marlos C. Machado, et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents, 2017, J. Artif. Intell. Res.
[45] Sanjeev Arora, et al. Theoretical Analysis of Auto Rate-Tuning by Batch Normalization, 2018, ICLR.
[46] Ritu Chadha, et al. Limitations of the Lipschitz constant as a defense against adversarial examples, 2018, Nemesis/UrbReas/SoGood/IWAISe/GDM@PKDD/ECML.
[47] Yuichi Yoshida, et al. Spectral Norm Regularization for Improving the Generalizability of Deep Learning, 2017, ArXiv.
[48] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[49] Trevor Darrell, et al. Regularization Matters in Policy Optimization -- An Empirical Study on Continuous Control, 2020.
[50] Tianyi Chen, et al. Adaptive Temporal Difference Learning with Linear Function Approximation, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.