[1] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[2] Geoffrey E. Hinton, et al. Layer Normalization, 2016, arXiv.
[3] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[4] Sami Abu-El-Haija, et al. Learning Edge Representations via Low-Rank Asymmetric Projections, 2017, CIKM.
[5] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[6] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.
[7] Patrick Seemann, et al. Matrix Factorization Techniques for Recommender Systems, 2014.
[8] Yang You, et al. Scaling SGD Batch Size to 32K for ImageNet Training, 2017, arXiv.
[9] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[10] Li Fei-Fei, et al. Detecting Events and Key Actors in Multi-person Videos, 2016, CVPR.
[11] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[12] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.