On the implicit minimization of alternative loss functions when training deep networks
[1] Hossein Mobahi et al. Large Margin Deep Networks for Classification, 2018, NeurIPS.
[2] Yi Lin. A note on margin-based loss functions in classification, 2004.
[3] Nathan Srebro et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res.
[4] Elad Hoffer et al. Train longer, generalize better: closing the generalization gap in large batch training of neural networks, 2017, NIPS.
[5] Samy Bengio et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[6] Quoc V. Le et al. A Bayesian Perspective on Generalization and Stochastic Gradient Descent, 2017, ICLR.
[7] Kaiming He et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[8] Matus Telgarsky et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[9] Jorge Nocedal et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[10] Ryota Tomioka et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[11] Hossein Mobahi et al. Predicting the Generalization Gap in Deep Networks with Margin Distributions, 2018, ICLR.
[12] Nathan Srebro et al. Exploring Generalization in Deep Learning, 2017, NIPS.
[13] Tomaso A. Poggio et al. Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization, 2019, ArXiv.
[14] Sanjeev Arora et al. Implicit Regularization in Deep Matrix Factorization, 2019, NeurIPS.
[15] Guigang Zhang et al. Deep Learning, 2016, Int. J. Semantic Comput.
[16] Yang You et al. Scaling SGD Batch Size to 32K for ImageNet Training, 2017, ArXiv.
[17] Quoc V. Le et al. Don't Decay the Learning Rate, Increase the Batch Size, 2017, ICLR.
[18] Tomaso A. Poggio et al. A Surprising Linear Relationship Predicts Test Performance in Deep Networks, 2018, ArXiv.
[19] Kaifeng Lyu et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.
[20] Nathan Srebro et al. The Marginal Value of Adaptive Gradient Methods in Machine Learning, 2017, NIPS.