[1] Homanga Bharadhwaj,et al. Continual Model-Based Reinforcement Learning with Hypernetworks , 2020, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[2] Nikhil Ketkar,et al. Introduction to PyTorch , 2021, Deep Learning with Python.
[3] Benjamin F. Grewe,et al. Meta-Learning via Hypernetworks , 2020.
[4] Tomer Galanti,et al. On the Modularity of Hypernetworks , 2020, NeurIPS.
[5] Animesh Garg,et al. D2RL: Deep Dense Architectures in Reinforcement Learning , 2020, ArXiv.
[6] Yoram Louzoun,et al. Explicit Gradient Learning for Black-Box Optimization , 2020, ICML.
[7] Csaba Szepesvari,et al. Bandit Algorithms , 2020.
[8] Yee Whye Teh,et al. Multiplicative Interactions and Where to Find Them , 2020, ICLR.
[9] Hod Lipson,et al. Principled Weight Initialization for Hypernetworks , 2020, ICLR.
[10] Lior Wolf,et al. Comparing the Parameter Complexity of Hypernetworks and the Embedding-Based Alternative , 2020, ArXiv.
[11] Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies , 2020, ICLR.
[12] Alex Smola,et al. Meta-Q-Learning , 2019, ICLR.
[13] Benjamin F. Grewe,et al. Continual learning with hypernetworks , 2019, ICLR.
[14] Sergey Levine,et al. Model-Based Reinforcement Learning for Atari , 2019, ICLR.
[15] Larry Rudolph,et al. A Closer Look at Deep Policy Gradients , 2018, ICLR.
[16] Lior Wolf,et al. Deep Meta Functionals for Shape Representation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[17] Sergey Levine,et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction , 2019, NeurIPS.
[18] Erik Nijkamp,et al. A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks , 2019, ArXiv.
[19] Sergey Levine,et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables , 2019, ICML.
[20] Fuxin Li,et al. HyperGAN: A Generative Model for Diverse, Performant Neural Networks , 2019, ICML.
[21] Doina Precup,et al. Off-Policy Deep Reinforcement Learning without Exploration , 2018, ICML.
[22] Razvan Pascanu,et al. Meta-Learning with Latent Embedding Optimization , 2018, ICLR.
[23] Context-Based Meta-Reinforcement Learning with Structured Latent Space , 2019.
[24] Saeed Saremi,et al. On approximating ∇f with neural networks , 2019, ArXiv.
[25] Alexey Potapov,et al. HyperNets and their application to learning spatial transformations , 2018, ICANN.
[26] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[27] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[28] Yuichi Yoshida,et al. Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.
[29] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[30] Theodore Lim,et al. SMASH: One-Shot Model Architecture Search through HyperNetworks , 2017, ICLR.
[31] Pieter Abbeel,et al. A Simple Neural Attentive Meta-Learner , 2017, ICLR.
[32] Jaime G. Carbonell,et al. The exploding gradient problem demystified - definition, prevalence, impact, origin, tradeoffs, and solutions , 2017.
[33] Richard S. Sutton,et al. A Deeper Look at Experience Replay , 2017, ArXiv.
[34] Takayuki Okatani,et al. HyperNetworks with statistical filtering for defending adversarial examples , 2017, ArXiv.
[35] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[36] Li Zhang,et al. Learning to Learn: Meta-Critic Networks for Sample Efficient Learning , 2017, ArXiv.
[37] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[38] Quoc V. Le,et al. HyperNetworks , 2016, ICLR.
[39] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[41] Luc Van Gool,et al. Dynamic Filter Networks , 2016, NIPS.
[42] Philip S. Thomas,et al. Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning , 2016, ICML.
[43] Tim Salimans,et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks , 2016, NIPS.
[44] Jianfeng Gao,et al. Deep Reinforcement Learning with a Natural Language Action Space , 2015, ACL.
[45] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[46] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[47] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[48] Jürgen Schmidhuber,et al. Training Very Deep Networks , 2015, NIPS.
[49] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[50] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[51] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[52] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[53] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[54] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[55] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[56] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[57] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[58] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[59] Robert Givan,et al. Model Minimization in Markov Decision Processes , 1997, AAAI/IAAI.
[60] Jürgen Schmidhuber,et al. Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks , 1992, Neural Computation.
[61] James L. McClelland. Putting Knowledge in its Place: A Scheme for Programming Parallel Processing Structures on the Fly , 1988, Cogn. Sci.