Information Theory in Density Destructors

Density destructors are differentiable, invertible transforms that map multivariate PDFs of arbitrary structure (low entropy) into unstructured PDFs (maximum entropy). Multivariate Gaussianization and multivariate equalization are specific members of this family, which break down the complexity of the original PDF through a sequence of elementary transforms that progressively remove the structure of the data. We demonstrate how this property of density-destructive flows is connected to classical information theory, and how density destructors can be used to obtain more accurate estimates of information-theoretic quantities. Experiments on total correlation and mutual information in multivariate datasets illustrate the performance of density destructors compared to competing methods. These results suggest that information-theoretic measures may serve as alternative optimization criteria when learning density-destructive flows.
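The abstract does not spell out the estimators, but the connection it points to can be sketched as follows (our notation, based on the iterative Gaussianization construction; these are not equations stated in the abstract). The total correlation (multi-information) of a $D$-dimensional variable measures all the structure that a density destructor must remove:

$$ T(\mathbf{x}) \;=\; \sum_{d=1}^{D} H(x_d) - H(\mathbf{x}) \;=\; D_{\mathrm{KL}}\!\Big( p(\mathbf{x}) \,\Big\|\, \textstyle\prod_{d=1}^{D} p(x_d) \Big). $$

Each iteration of the flow applies a marginal Gaussianization $\Psi_k$ (which leaves $T$ unchanged, since component-wise invertible maps cannot alter dependence) followed by a rotation $R_k$:

$$ \mathbf{x}^{(k+1)} = R_k\, \Psi_k\big(\mathbf{x}^{(k)}\big), \qquad \mathbf{x}^{(0)} = \mathbf{x}. $$

Because the output converges to an independent maximum-entropy Gaussian with $T = 0$, the total correlation of the input telescopes into a sum of marginal negentropies $J(\cdot)$ that reappear after each rotation:

$$ T(\mathbf{x}) \;=\; \sum_{k} \sum_{d=1}^{D} J\big(x_d^{(k+1)}\big), $$

and mutual information between two blocks follows from three such estimates, $I(\mathbf{x};\mathbf{y}) = T([\mathbf{x},\mathbf{y}]) - T(\mathbf{x}) - T(\mathbf{y})$.

A minimal numerical sketch of this recipe in plain NumPy/SciPy (the helper names, the rank-based Gaussianization, and the histogram negentropy estimate are our illustrative choices, not the authors' implementation):

```python
import numpy as np
from scipy.stats import norm

def marginal_gaussianize(X):
    """Map each column to a standard Gaussian via its empirical CDF (rank transform)."""
    n = X.shape[0]
    U = (np.argsort(np.argsort(X, axis=0), axis=0) + 0.5) / n  # empirical CDF values in (0, 1)
    return norm.ppf(U)

def marginal_negentropy(X, bins=50):
    """Sum over columns of J(x_d) = H(Gaussian with matched variance) - H(x_d),
    using a crude histogram estimate of the differential entropy."""
    J = 0.0
    for x in X.T:
        p, edges = np.histogram(x, bins=bins, density=True)
        w = np.diff(edges)
        nz = p > 0
        h = -np.sum(p[nz] * np.log(p[nz]) * w[nz])            # entropy of x_d
        h_gauss = 0.5 * np.log(2 * np.pi * np.e * np.var(x))  # entropy of matched Gaussian
        J += max(h_gauss - h, 0.0)                            # negentropy is non-negative
    return J

def total_correlation_rbig(X, n_iters=50, seed=0):
    """Estimate T(x) by accumulating the dependence removed at each iteration."""
    rng = np.random.default_rng(seed)
    T = 0.0
    for _ in range(n_iters):
        X = marginal_gaussianize(X)  # makes marginals Gaussian; preserves T
        Q, _ = np.linalg.qr(rng.normal(size=(X.shape[1], X.shape[1])))
        X = X @ Q                    # random rotation exposes remaining dependence
        T += marginal_negentropy(X)  # negentropy reappearing in the marginals
    return T
```

As a sanity check, feeding `total_correlation_rbig` a correlated Gaussian sample should recover roughly $-\tfrac{1}{2}\log\det(\text{correlation matrix})$, the closed-form total correlation of a Gaussian.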
