Natural Neural Networks
Guillaume Desjardins | Karen Simonyan | Razvan Pascanu | Koray Kavukcuoglu
[1] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[2] Nicol N. Schraudolph, et al. Accelerated Gradient Descent by Factor-Centering Decomposition, 1998.
[3] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[4] Marc Teboulle, et al. Mirror descent and nonlinear projected subgradient methods for convex optimization, 2003, Oper. Res. Lett.
[5] Nicolas Le Roux, et al. Topmoumoute Online Natural Gradient Algorithm, 2007, NIPS.
[6] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[7] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[8] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[9] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[10] Patrick L. Combettes, et al. Proximal Splitting Methods in Signal Processing, 2009, Fixed-Point Algorithms for Inverse Problems in Science and Engineering.
[11] Jascha Sohl-Dickstein, et al. The Natural Gradient by Analogy to Signal Whitening, and Recipes and Tricks for its Use, 2012, ArXiv.
[12] Klaus-Robert Müller, et al. Deep Boltzmann Machines and the Centering Trick, 2012, Neural Networks: Tricks of the Trade.
[13] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[14] Tapani Raiko, et al. Deep Learning Made Easier by Linear Transformations in Perceptrons, 2012, AISTATS.
[15] Sridhar Mahadevan, et al. Projected Natural Actor-Critic, 2013, NIPS.
[16] Yann Ollivier, et al. Riemannian metrics for neural networks, 2013, ArXiv.
[17] Tapani Raiko, et al. Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities, 2013, ICLR.
[18] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[19] Rich Caruana, et al. Do Deep Nets Really Need to be Deep?, 2013, NIPS.
[20] Razvan Pascanu, et al. Revisiting Natural Gradient for Deep Networks, 2013, ICLR.
[21] Xiaohui Zhang, et al. Parallel training of Deep Neural Networks with Natural Gradient and Parameter Averaging, 2014, ICLR.
[22] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[23] Sayan Mukherjee, et al. The Information Geometry of Mirror Descent, 2013, IEEE Transactions on Information Theory.
[24] Ruslan Salakhutdinov, et al. Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix, 2015, ICML.
[25] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.
[26] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[28] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.