Gradient Amplification: An efficient way to train deep neural networks
Yi Pan | Sunitha Basodi | Chunyan Ji | Haiping Zhang
[1] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[2] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[3] Md. Zakirul Alam Bhuiyan,et al. A Survey on Deep Learning in Big Data , 2017, 2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC).
[4] Bram van Ginneken,et al. A survey on deep learning in medical image analysis , 2017, Medical Image Anal..
[5] Tom Schaul,et al. No more pesky learning rates , 2012, ICML.
[6] emontmej,et al. High Performance Computing , 2003, Lecture Notes in Computer Science.
[7] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[8] Kilian Q. Weinberger,et al. Deep Networks with Stochastic Depth , 2016, ECCV.
[9] Shuai Wang,et al. Deep learning for sentiment analysis: A survey , 2018, WIREs Data Mining Knowl. Discov..
[10] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[11] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.
[12] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[13] Boris Hanin,et al. Which Neural Net Architectures Give Rise To Exploding and Vanishing Gradients? , 2018, NeurIPS.
[14] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[16] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[17] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[18] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[19] Andreas Kamilaris,et al. Deep learning in agriculture: A survey , 2018, Comput. Electron. Agric..
[20] Abhinav Vishnu,et al. Deep learning for computational chemistry , 2017, J. Comput. Chem..
[21] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[22] John Moody,et al. Learning rate schedules for faster stochastic gradient search , 1992, Neural Networks for Signal Processing II Proceedings of the 1992 IEEE Workshop.
[23] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..
[24] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[25] Louis B. Rall,et al. Automatic differentiation , 1981 .
[26] Quoc V. Le,et al. Don't Decay the Learning Rate, Increase the Batch Size , 2017, ICLR.
[27] Xiaohui Peng,et al. Deep Learning for Sensor-based Activity Recognition: A Survey , 2017, Pattern Recognit. Lett..