When Does Label Smoothing Help?
[1] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.
[2] Eric B. Baum, et al. Supervised Learning of Probability Distributions by Neural Networks, 1987, NIPS.
[3] Esther Levin, et al. Accelerated Learning in Layered Neural Networks, 1988, Complex Syst.
[4] Scott E. Fahlman, et al. An empirical study of learning speed in back-propagation networks, 1988.
[5] Anders Krogh, et al. Introduction to the theory of neural computation, 1994, The advanced book program.
[6] Naftali Tishby, et al. The information bottleneck method, 2000, ArXiv.
[7] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[8] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[9] Naftali Tishby, et al. Deep learning and the information bottleneck principle, 2015, 2015 IEEE Information Theory Workshop (ITW).
[10] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Qi Tian, et al. DisturbLabel: Regularizing CNN on the Loss Layer, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[14] Naftali Tishby, et al. Opening the Black Box of Deep Neural Networks via Information, 2017, ArXiv.
[15] Geoffrey E. Hinton, et al. Regularizing Neural Networks by Penalizing Confident Output Distributions, 2017, ICLR.
[16] Navdeep Jaitly, et al. Towards Better Decoding and Language Model Integration in Sequence to Sequence Models, 2016, INTERSPEECH.
[17] Sergey Ioffe, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016, AAAI.
[18] Kilian Q. Weinberger, et al. On Calibration of Modern Neural Networks, 2017, ICML.
[19] Marc'Aurelio Ranzato, et al. Analyzing Uncertainty in Neural Machine Translation, 2018, ICML.
[20] Vijay Vasudevan, et al. Learning Transferable Architectures for Scalable Image Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[21] Alok Aggarwal, et al. Regularized Evolution for Image Classifier Architecture Search, 2018, AAAI.
[22] Quoc V. Le, et al. Do Better ImageNet Models Transfer Better?, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Sunita Sarawagi, et al. Calibration of Encoder Decoder Models for Neural Machine Translation, 2019, ArXiv.
[24] Quoc V. Le, et al. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism, 2018, ArXiv.
[25] Cian O'Donnell, et al. Adaptive Estimators Show Information Compression in Deep Neural Networks, 2019, ICLR.