Learning to learn by gradient descent by gradient descent
Marcin Andrychowicz | Misha Denil | Sergio Gomez Colmenarejo | Matthew W. Hoffman | David Pfau | Tom Schaul | Nando de Freitas
[1] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[2] Peter R. Conwell, et al. Fixed-weight networks can learn, 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[3] Yoshua Bengio, et al. Learning a synaptic learning rule, 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[4] Richard J. Mammone, et al. Meta-neural networks that learn by learning, 1992, IJCNN International Joint Conference on Neural Networks.
[5] Jürgen Schmidhuber, et al. Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks, 1992, Neural Computation.
[6] Richard S. Sutton, et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta, 1992, AAAI.
[7] Jürgen Schmidhuber, et al. A neural network that embeds its own meta-levels, 1993, IEEE International Conference on Neural Networks.
[8] Martin A. Riedmiller, et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm, 1993, IEEE International Conference on Neural Networks.
[9] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[10] David H. Wolpert, et al. No free lunch theorems for optimization, 1997, IEEE Trans. Evol. Comput.
[11] Paul Tseng, et al. An Incremental Gradient(-Projection) Method with Momentum Term and Adaptive Stepsize Rule, 1998, SIAM J. Optim.
[12] G. V. Puskorius, et al. A signal processing framework based on dynamic neural networks with application to problems in adaptation, filtering, and classification, 1998, Proc. IEEE.
[13] Stephen J. Wright, et al. Numerical Optimization, 2006, Springer.
[14] Nicol N. Schraudolph, et al. Local Gain Adaptation in Stochastic Gradient Descent, 1999.
[15] A. Steven Younger, et al. Fixed-weight on-line learning, 1999, IEEE Trans. Neural Networks.
[16] Magnus Thor Jonsson, et al. Evolution and design of distributed learning rules, 2000, 2000 IEEE Symposium on Combinations of Evolutionary Computation and Neural Networks.
[17] Sepp Hochreiter, et al. Learning to Learn Using Gradient Descent, 2001, ICANN.
[18] Sepp Hochreiter, et al. Meta-learning with backpropagation, 2001, IJCNN'01 International Joint Conference on Neural Networks.
[19] Danil V. Prokhorov, et al. Adaptive behavior with fixed weights in RNN: an overview, 2002, Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN'02).
[20] Jürgen Schmidhuber, et al. Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement, 1997, Machine Learning.
[21] Samy Bengio, et al. On the search for new learning rules for ANNs, 1995, Neural Processing Letters.
[22] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[23] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[24] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[25] Julien Mairal, et al. Optimization with Sparsity-Inducing Penalties, 2011, Found. Trends Mach. Learn.
[26] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[27] Ted K. Ralphs, et al. Integer and Combinatorial Optimization, 2013.
[28] Razvan Pascanu, et al. Advances in optimizing recurrent networks, 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[29] Alex Graves, et al. Neural Turing Machines, 2014, ArXiv.
[30] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[31] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.
[32] S. Frick, et al. Compressed Sensing, 2014, Computer Vision: A Reference Guide.
[33] Leon A. Gatys, et al. A Neural Algorithm of Artistic Style, 2015, ArXiv.
[34] Daan Wierstra, et al. Meta-Learning with Memory-Augmented Neural Networks, 2016, ICML.
[35] Joshua B. Tenenbaum, et al. Building machines that learn and think like people, 2016, Behavioral and Brain Sciences.
[36] Sebastian Nowozin, et al. Learning Step Size Controllers for Robust Neural Network Training, 2016, AAAI.
[37] C. A. Nelson, et al. Learning to Learn, 2017, Encyclopedia of Machine Learning and Data Mining.