NeurIPS 2020 Competition: Predicting Generalization in Deep Learning

Understanding generalization is arguably one of the most important open questions in deep learning. Deep learning has been successfully applied to a wide range of problems, from pattern recognition to complex decision making, yet researchers have raised a number of concerns about it, chief among them generalization. Despite numerous attempts, conventional statistical learning approaches have yet to provide a satisfactory explanation of why deep learning works. A recent line of work addresses this problem by trying to predict generalization performance through complexity measures. In this competition, we invite the community to propose complexity measures that can accurately predict the generalization of models. A robust and general complexity measure would potentially lead to a better understanding of the underlying mechanisms of deep learning and of the behavior of deep models on unseen data, or shed light on tighter generalization bounds. All of these outcomes would be important for making deep learning more robust and reliable.
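To make the task concrete, a complexity measure in this setting is a function that maps a trained model (and, optionally, its training data) to a scalar that should rank-correlate with the model's generalization gap. The sketch below is a minimal, hypothetical illustration rather than the competition's official interface: it assumes PyTorch models and uses one simple norm-based proxy from the literature (the product of layer-wise Frobenius norms); the names `complexity_measure` and `evaluate_measure` are illustrative.

```python
# Hypothetical sketch: score each trained model with a scalar complexity
# measure and check how well that score rank-correlates with the observed
# generalization gap across a pool of models.
import torch.nn as nn
from scipy.stats import kendalltau


def complexity_measure(model: nn.Module) -> float:
    """Product of Frobenius norms of the weight tensors of linear/conv layers.

    This is only one simple norm-based proxy for capacity, not a method
    endorsed by the competition.
    """
    score = 1.0
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            score *= m.weight.detach().norm(p="fro").item()
    return score


def evaluate_measure(models, train_accs, test_accs) -> float:
    """Kendall rank correlation between the measure and the true gap."""
    scores = [complexity_measure(m) for m in models]
    gaps = [tr - te for tr, te in zip(train_accs, test_accs)]
    tau, _ = kendalltau(scores, gaps)
    return tau
```

A stronger measure would, under this framing, rank a pool of trained models in nearly the same order as their observed gaps between training and test accuracy, regardless of architecture or dataset.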
