[1] J. Rissanen. A Universal Prior for Integers and Estimation by Minimum Description Length, 1983.
[2] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[3] Jürgen Schmidhuber, et al. Flat Minima, 1997, Neural Computation.
[4] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.
[5] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[6] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[7] Honglak Lee, et al. An Analysis of Single-Layer Networks in Unsupervised Feature Learning, 2011, AISTATS.
[8] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[9] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[10] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[11] Yoshua Bengio, et al. Maxout Networks, 2013, ICML.
[12] Razvan Pascanu, et al. Advances in optimizing recurrent networks, 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[13] Qiang Chen, et al. Network In Network, 2013, ICLR.
[14] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[15] Joan Bruna, et al. Intriguing properties of neural networks, 2013, ICLR.
[16] Luca Rigazio, et al. Towards Deep Neural Network Architectures Robust to Adversarial Examples, 2014, ICLR.
[17] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[18] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples, 2014, ICLR.
[19] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[20] Kenji Kawaguchi, et al. Deep Learning without Poor Local Minima, 2016, NIPS.
[21] Samy Bengio, et al. Revisiting Distributed Synchronous SGD, 2016, ArXiv.
[22] Yonghui Wu, et al. Exploring the Limits of Language Modeling, 2016, ArXiv.
[23] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[24] Les E. Atlas, et al. Full-Capacity Unitary Recurrent Neural Networks, 2016, NIPS.
[25] Michael I. Jordan, et al. Gradient Descent Only Converges to Minimizers, 2016, COLT.
[26] Yoshua Bengio, et al. Unitary Evolution Recurrent Neural Networks, 2015, ICML.
[27] Pradeep Dubey, et al. Distributed Deep Learning Using Synchronous Stochastic Gradient Descent, 2016, ArXiv.
[28] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[29] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[31] Ngoc Thang Vu, et al. Densely Connected Convolutional Networks for Speech Recognition, 2018, ITG Symposium on Speech Communication.