Linear Backprop in non-linear networks
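The title points at a simple idea: run the usual non-linear forward pass, but propagate gradients backward as if the activations were linear (identity), in the spirit of straight-through gradient estimators. Below is a minimal sketch of that reading, not the paper's exact formulation; the class name `LinearBackpropReLU` and the choice of ReLU with an identity backward rule are illustrative assumptions.

```python
import torch


class LinearBackpropReLU(torch.autograd.Function):
    """ReLU in the forward pass; identity (linear) gradient in the backward pass.

    Hypothetical sketch: the non-linearity shapes the activations, but the
    backward pass treats the layer as if it were linear.
    """

    @staticmethod
    def forward(ctx, x):
        # Standard ReLU forward: zero out negative entries.
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        # Linear backprop: pass the gradient through unchanged,
        # ignoring the ReLU's actual (zero-or-one) derivative.
        return grad_output


# Usage: a drop-in replacement for torch.relu.
x = torch.randn(4, requires_grad=True)
y = LinearBackpropReLU.apply(x).sum()
y.backward()
print(x.grad)  # all ones: the gradient skipped the non-linearity
```

Under this reading, the forward computation is unchanged, so the network still represents a non-linear function; only the backward pass is simplified to the linear case.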