Understanding and correcting pathologies in the training of learned optimizers
Luke Metz | Niru Maheswaranathan | Jeremy Nixon | C. Daniel Freeman | Jascha Sohl-Dickstein