Information in Infinite Ensembles of Infinitely-Wide Neural Networks

In this preliminary work, we study the generalization properties of infinite ensembles of infinitely-wide neural networks. Amazingly, this model family admits tractable calculations for many information-theoretic quantities. We report analytical and empirical investigations in the search for signals that correlate with generalization.
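Roughly, the tractability comes from the infinite-width limit: an infinite ensemble of infinitely-wide networks has a Gaussian-process predictive distribution whose kernel can be computed in closed form, so Gaussian information-theoretic quantities follow analytically. Below is a minimal sketch of this idea using the Neural Tangents library in JAX; the architecture, the toy data, and the prior-predictive entropy computation are illustrative assumptions, not the paper's exact experimental setup.

```python
# Minimal sketch (illustrative, not the paper's exact setup): Neural Tangents
# gives the closed-form NNGP kernel of an infinitely wide network, so the
# infinite ensemble's prior predictive is Gaussian and its differential
# entropy is tractable.
import jax.numpy as jnp
from jax import random
from neural_tangents import stax

# Hypothetical architecture: a two-hidden-layer ReLU network.
# (The width argument does not affect the infinite-width kernel.)
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

key = random.PRNGKey(0)
x_train = random.normal(key, (20, 10))  # toy inputs: 20 examples, 10 features

# Closed-form NNGP kernel of the infinite-width limit (a Gaussian process).
k_train = kernel_fn(x_train, x_train, 'nngp')

# Differential entropy of the Gaussian prior predictive over training outputs:
# H = 0.5 * log det(2 * pi * e * K). Jitter added for numerical stability.
jitter = 1e-6 * jnp.eye(k_train.shape[0])
_, logdet = jnp.linalg.slogdet(2 * jnp.pi * jnp.e * (k_train + jitter))
entropy = 0.5 * logdet
print("prior predictive entropy (nats):", entropy)
```

The same kernel machinery extends to the NTK (replace 'nngp' with 'ntk') and, with a Gaussian likelihood, to posterior predictive covariances, which is what makes many mutual-information-style quantities computable in this model family.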
