Shade: Information-Based Regularization for Deep Learning