Information-Theoretic Methods in Deep Neural Networks: Recent Advances and Emerging Opportunities
José Carlos Príncipe | Luis Gonzalo Sánchez Giraldo | Shujian Yu