Momentum Batch Normalization for Deep Learning with Small Batch Size
Lei Zhang | Deyu Meng | Xian-Sheng Hua | Hongwei Yong | Jianqiang Huang
[1] Lei Huang,et al. Iterative Normalization: Beyond Standardization Towards Efficient Whitening , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Ruimao Zhang,et al. SSN: Learning Sparse Switchable Normalization via SparsestMax , 2019, International Journal of Computer Vision.
[3] Ping Luo,et al. Towards Understanding Regularization in Batch Normalization , 2018, ICLR.
[4] Kaiming He,et al. Group Normalization , 2018, International Journal of Computer Vision.
[5] Yuan Xie,et al. $L1$-Norm Batch Normalization for Efficient Training of Deep Neural Networks , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[6] Boris Flach,et al. Stochastic Normalizations as Bayesian Learning , 2018, ACCV.
[7] Carla P. Gomes,et al. Understanding Batch Normalization , 2018, NeurIPS.
[8] Aleksander Madry,et al. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift) , 2018, NeurIPS.
[9] Qingyao Wu,et al. Double Forward Propagation for Memorized Batch Normalization , 2018, AAAI.
[10] Lei Huang,et al. Decorrelated Batch Normalization , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[11] Minhyung Cho,et al. Riemannian approach to batch normalization , 2017, NIPS.
[12] Kaiming He,et al. Mask R-CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[13] Sergey Ioffe,et al. Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models , 2017, NIPS.
[14] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Lei Zhang,et al. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising , 2016, IEEE Transactions on Image Processing.
[16] Andrea Vedaldi,et al. Instance Normalization: The Missing Ingredient for Fast Stylization , 2016, ArXiv.
[17] Oriol Vinyals,et al. Matching Networks for One Shot Learning , 2016, NIPS.
[18] Tim Salimans,et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks , 2016, NIPS.
[19] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[20] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Chong Wang,et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[22] Christopher D. Manning,et al. Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.
[23] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[25] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[26] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[27] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[28] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res.
[29] Léon Bottou,et al. Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.
[30] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[31] Guozhong An,et al. The Effects of Adding Noise During Backpropagation Training on a Generalization Performance , 1996, Neural Computation.
[32] Christopher M. Bishop,et al. Current address: Microsoft Research, , 2022 .
[33] L. Bottou. Stochastic Gradient Learning in Neural Networks , 1991 .
[34] Russell V. Lenth,et al. Cumulative Distribution Function of the Noncentral T Distribution , 1989 .
[35] L. Crocker,et al. Introduction to Classical and Modern Test Theory , 1986 .