Antoni B. Chan | Yufei Cui | Ziquan Liu | Jia Wan | Yu Mao
[1] Carla P. Gomes, et al. Understanding Batch Normalization, 2018, NeurIPS.
[2] Matus Telgarsky, et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[3] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[4] Haroon Idrees, et al. Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds, 2018, ECCV.
[5] Jian Yang, et al. Understanding the Disharmony between Weight Normalization Family and Weight Decay, 2020, AAAI.
[6] Antoni B. Chan, et al. Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations, 2020, arXiv.
[7] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[8] Kaiming He, et al. Group Normalization, 2018, ECCV.
[9] S. Nadarajah. A generalized normal distribution, 2005.
[10] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, CVPR.
[11] Boris Ginsburg, et al. Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification, 2017, arXiv.
[12] Elad Hoffer, et al. Norm matters: efficient and accurate normalization schemes in deep networks, 2018, NeurIPS.
[13] Ali Farhadi, et al. YOLOv3: An Incremental Improvement, 2018, arXiv.
[14] Ruslan Salakhutdinov, et al. Data-Dependent Path Normalization in Neural Networks, 2015, ICLR.
[15] Lei Huang, et al. Projection Based Weight Normalization for Deep Neural Networks, 2017, arXiv.
[16] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.
[17] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.
[18] Yuhong Li, et al. CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes, 2018, CVPR.
[19] George Papandreou, et al. Rethinking Atrous Convolution for Semantic Image Segmentation, 2017, arXiv.
[20] Aleksander Madry, et al. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift), 2018, NeurIPS.
[21] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[22] Wei Hu, et al. Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced, 2018, NeurIPS.
[23] Soumith Chintala, et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015, ICLR.
[24] Ping Luo, et al. Towards Understanding Regularization in Batch Normalization, 2018, ICLR.
[25] Vishnu Naresh Boddeti, et al. On the Intrinsic Dimensionality of Image Representations, 2019, CVPR.
[26] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[27] Alessandro Laio, et al. Intrinsic dimension of data representations in deep neural networks, 2019, NeurIPS.
[28] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[29] Colin Wei, et al. Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks, 2019, NeurIPS.
[30] Mikhail Belkin, et al. To understand deep learning we need to understand kernel learning, 2018, ICML.
[31] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[32] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[33] Jascha Sohl-Dickstein, et al. A Mean Field Theory of Batch Normalization, 2019, ICLR.
[34] Guodong Zhang, et al. Three Mechanisms of Weight Decay Regularization, 2018, ICLR.
[35] Stefano Soatto, et al. Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence, 2019, NeurIPS.
[36] Twan van Laarhoven, et al. L2 Regularization versus Batch and Weight Normalization, 2017, arXiv.
[37] Philip M. Long, et al. Benign overfitting in linear regression, 2019, Proceedings of the National Academy of Sciences.
[38] Richard S. Zemel, et al. Prototypical Networks for Few-shot Learning, 2017, NIPS.
[39] Yihong Gong, et al. Bayesian Loss for Crowd Count Estimation With Point Supervision, 2019, ICCV.
[40] Subhransu Maji, et al. Semantic contours from inverse detectors, 2011, ICCV.
[41] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[42] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.
[43] Luc Van Gool, et al. The Pascal Visual Object Classes Challenge: A Retrospective, 2014, International Journal of Computer Vision.
[44] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[45] Sanjeev Arora, et al. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization, 2018, ICML.
[46] Ruslan Salakhutdinov, et al. Path-SGD: Path-Normalized Optimization in Deep Neural Networks, 2015, NIPS.
[47] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[48] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.