Entropy and mutual information in models of deep neural networks
Marylou Gabrié | Andre Manoel | Clément Luneau | Jean Barbier | Nicolas Macris | Florent Krzakala | Lenka Zdeborová
[1] Nicolas Macris, et al. The stochastic interpolation method: A simple scheme to prove replica formulas in Bayesian inference, 2017, arXiv.
[2] Sundeep Rangan, et al. Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing, 2009, NIPS.
[3] A. Kraskov, et al. Estimating mutual information, 2003, Physical Review E.
[4] G. Parisi, et al. Mean-field equations for spin models with orthogonal interaction matrices, 1995, arXiv:cond-mat/9503009.
[5] Antonia Maria Tulino, et al. Random Matrix Theory and Wireless Communications, 2004, Foundations and Trends in Communications and Information Theory.
[6] Sompolinsky, et al. Statistical mechanics of learning from examples, 1992, Physical Review A.
[7] E. Gardner, et al. Optimal storage properties of neural network models, 1988.
[8] Ralf R. Müller, et al. Vector Precoding for Wireless MIMO Systems and its Replica Analysis, 2007, IEEE Journal on Selected Areas in Communications.
[9] Naftali Tishby, et al. Opening the Black Box of Deep Neural Networks via Information, 2017, arXiv.
[10] Y. Kabashima, et al. Learning from correlated patterns by simple perceptrons, 2008, arXiv:0809.1978.
[11] Surya Ganguli, et al. On the Expressive Power of Deep Neural Networks, 2016, ICML.
[12] J. S. Rowlinson, et al. Phase Transitions, 2021, Topics in Statistical Mechanics.
[13] Yoshiyuki Kabashima, et al. Inference from correlated patterns: a unified theory for perceptron learning and linear vector channels, 2007, arXiv.
[14] Toshiyuki Tanaka, et al. A statistical-mechanics approach to large-system analysis of CDMA multiuser detectors, 2002, IEEE Transactions on Information Theory.
[15] Nicolas Macris, et al. The Mutual Information in Random Linear Estimation Beyond i.i.d. Matrices, 2018, 2018 IEEE International Symposium on Information Theory (ISIT).
[16] David D. Cox, et al. On the information bottleneck theory of deep learning, 2018, ICLR.
[17] Naftali Tishby, et al. The information bottleneck method, 2000, arXiv.
[18] N. Macris, et al. The adaptive interpolation method: a simple scheme to prove replica formulas in Bayesian inference, 2018, Probability Theory and Related Fields.
[19] Christian Van den Broeck, et al. Statistical Mechanics of Learning, 2001.
[20] Misha Denil, et al. ACDC: A Structured Efficient Linear Layer, 2015, ICLR.
[21] Gal Chechik, et al. Information Bottleneck for Gaussian Variables, 2003, Journal of Machine Learning Research.
[22] Nicolas Macris, et al. Optimal errors and phase transitions in high-dimensional generalized linear models, 2017, Proceedings of the National Academy of Sciences.
[23] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[24] Andrea Montanari, et al. Message-passing algorithms for compressed sensing, 2009, Proceedings of the National Academy of Sciences.
[25] Le Song, et al. Deep Fried Convnets, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[26] Sundeep Rangan, et al. Inference in Deep Networks in High Dimensions, 2017, 2018 IEEE International Symposium on Information Theory (ISIT).
[27] Mikko Vehkaperä, et al. Signal recovery using expectation consistent approximation for linear observations, 2014, 2014 IEEE International Symposium on Information Theory.
[28] Andrea Montanari, et al. High dimensional robust M-estimation: asymptotic variance via approximate message passing, 2013, Probability Theory and Related Fields.
[29] Koujin Takeda, et al. Analysis of CDMA systems that are characterized by eigenvalue spectrum, 2006, arXiv.
[30] Marc Lelarge, et al. Fundamental limits of symmetric low-rank matrix estimation, 2016, Probability Theory and Related Fields.
[31] Surya Ganguli, et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, 2013, ICLR.
[32] G. Parisi, et al. Replica field theory for deterministic models: II. A non-random spin glass with glassy behaviour, 1994, arXiv:cond-mat/9406074.
[33] Naftali Tishby, et al. Deep learning and the information bottleneck principle, 2015, 2015 IEEE Information Theory Workshop (ITW).
[34] Florent Krzakala, et al. Multi-layer generalized linear estimation, 2017, 2017 IEEE International Symposium on Information Theory (ISIT).
[35] D. Panchenko. The Sherrington-Kirkpatrick Model, 2013.
[36] E. Gardner. The space of interactions in neural network models, 1988.
[37] Sundeep Rangan, et al. Vector approximate message passing, 2017, 2017 IEEE International Symposium on Information Theory (ISIT).
[38] M. Opper, et al. Tractable approximations for probabilistic models: the adaptive Thouless-Anderson-Palmer mean field approach, 2001, Physical Review Letters.
[39] M. Opper, et al. Advanced mean field methods: theory and practice, 2001.
[40] Surya Ganguli, et al. Deep Information Propagation, 2016, ICLR.
[41] Florent Krzakala, et al. Statistical physics of inference: thresholds and algorithms, 2015, arXiv.
[42] S. Kak. Information, physics, and computation, 1996.
[43] Nicolas Macris, et al. Phase Transitions, Optimal Errors and Optimality of Message-Passing in Generalized Linear Models, 2017, arXiv.
[44] D. S. Dean, et al. Role of the interaction matrix in mean-field spin glass models, 2003, Physical Review E.
[45] Artemy Kolchinsky, et al. Estimating Mixture Entropy with Pairwise Distances, 2017, Entropy.
[46] Nicolas Macris, et al. The mutual information in random linear estimation, 2016, 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[47] Aaron C. Courville, et al. MINE: Mutual Information Neural Estimation, 2018, arXiv.
[48] M. Talagrand. Spin Glasses: A Challenge for Mathematicians. Cavity and Mean Field Models, 2003.
[49] Galen Reeves, et al. The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact, 2016, 2016 IEEE International Symposium on Information Theory (ISIT).
[50] Sompolinsky, et al. Storing infinite numbers of patterns in a spin-glass model of neural networks, 1985, Physical Review Letters.
[51] Y. Kabashima, et al. Perceptron capacity revisited: classification ability for correlated patterns, 2007, arXiv:0712.4050.
[52] Jeffrey Pennington, et al. Nonlinear random matrix theory for deep learning, 2019, NIPS.
[53] Guillermo Sapiro, et al. Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?, 2015, IEEE Transactions on Signal Processing.
[54] Florent Krzakala, et al. Statistical physics-based reconstruction in compressed sensing, 2011, arXiv.
[55] Stefano Soatto, et al. Emergence of invariance and disentangling in deep representations, 2017.
[56] David H. Wolpert, et al. Nonlinear Information Bottleneck, 2017, Entropy.
[57] Alexander A. Alemi, et al. Deep Variational Information Bottleneck, 2017, ICLR.
[58] Sundeep Rangan, et al. Generalized approximate message passing for estimation with random linear mixing, 2010, 2011 IEEE International Symposium on Information Theory Proceedings.
[59] Hidetoshi Nishimori. Statistical Physics of Spin Glasses and Information Processing: An Introduction, 2001.
[60] Stefano Soatto, et al. Information Dropout: Learning Optimal Representations Through Noisy Computation, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[61] S. Kirkpatrick, et al. Solvable Model of a Spin-Glass, 1975.
[62] M. Mézard. The space of interactions in neural networks: Gardner's computation with the cavity method, 1989.
[63] Stefano Ermon, et al. InfoVAE: Balancing Learning and Inference in Variational Autoencoders, 2019, AAAI.
[64] E. Gardner, et al. Three unfinished works on the optimal storage capacity of networks, 1989.
[65] Shlomo Shamai, et al. Support Recovery With Sparsely Sampled Free Random Matrices, 2011, IEEE Transactions on Information Theory.
[66] Olivier Marre, et al. Relevant sparse codes with variational information bottleneck, 2016, NIPS.
[67] Romain Couillet, et al. Harnessing neural networks: A random matrix approach, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[68] G. Parisi, et al. Replica field theory for deterministic models: I. Binary sequences with low autocorrelation, 1994, arXiv:hep-th/9405148.
[69] Nicolas Macris, et al. The layered structure of tensor estimation and its mutual information, 2017, 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[70] Nicolas Macris, et al. Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula, 2016, NIPS.
[71] Surya Ganguli, et al. Statistical mechanics of compressed sensing, 2010, Physical Review Letters.
[72] Andrew M. Saxe, et al. High-dimensional dynamics of generalization error in neural networks, 2017, Neural Networks.
[73] Gábor Lugosi, et al. Concentration Inequalities: A Nonasymptotic Theory of Independence, 2013.
[74] Galen Reeves. Additivity of information in multilayer networks via additive Gaussian noise transforms, 2017, 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[75] M. Mézard, et al. Spin Glass Theory and Beyond, 1987.
[76] Yoshiyuki Kabashima, et al. Erratum: A typical reconstruction limit of compressed sensing based on Lp-norm minimization, 2009, arXiv.