Frank Nielsen | Rob Brekelmans | Alireza Makhzani | Aram Galstyan | Greg Ver Steeg
[1] Alexander A. Alemi, et al. TherML: Thermodynamics of Machine Learning, 2018, ArXiv.
[2] Frank D. Wood, et al. All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference, 2020, ICML.
[3] Ruslan Salakhutdinov, et al. Annealing between distributions by averaging moments, 2013, NIPS.
[4] Stefano Soatto, et al. Information Dropout: Learning Optimal Representations Through Noisy Computation, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Frank D. Wood, et al. The Thermodynamic Variational Objective, 2019, NeurIPS.
[6] Yansong Gao, et al. A Free-Energy Principle for Representation Learning, 2020, ICML.
[7] Polina Golland, et al. DEMI: Discriminative Estimator of Mutual Information, 2020, ArXiv.
[8] Michael I. Jordan, et al. Graphical Models, Exponential Families, and Variational Inference, 2008, Found. Trends Mach. Learn.
[9] Nir Friedman, et al. The Information Bottleneck EM Algorithm, 2002, UAI.
[10] Lizhong Zheng, et al. I-Projection and the Geometry of Error Exponents, 2006.
[11] Naftali Tishby, et al. The information bottleneck method, 2000, ArXiv.
[12] J. Borwein, et al. Convex Analysis and Nonlinear Optimization, 2000.
[13] Geoffrey C. Fox, et al. A deterministic annealing approach to clustering, 1990, Pattern Recognit. Lett.
[14] Frank Nielsen, et al. On the Jensen–Shannon Symmetrization of Distances Relying on Abstract Means, 2019, Entropy.
[15] Xiao-Li Meng, et al. Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling, 1998.
[16] Radford M. Neal. Annealed importance sampling, 1998, Stat. Comput.
[17] E. Jaynes. Information Theory and Statistical Mechanics, 1957.
[18] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[19] Frank Nielsen, et al. A family of statistical symmetric divergences based on Jensen's inequality, 2010, ArXiv.
[20] Makoto Yamada, et al. Neural Methods for Point-wise Dependency Estimation, 2020, NeurIPS.
[21] Y. Ogata. A Monte Carlo method for high dimensional integration, 1989.
[22] Alireza Makhzani, et al. Evaluating Lossy Compression Rates of Deep Generative Models, 2020, ICML.
[23] Inderjit S. Dhillon, et al. Clustering with Bregman Divergences, 2005, J. Mach. Learn. Res.
[24] C. Jarzynski. Equilibrium free-energy differences from nonequilibrium measurements: A master-equation approach, 1997, cond-mat/9707325.
[25] Alexander A. Alemi, et al. On Variational Bounds of Mutual Information, 2019, ICML.
[26] Jorma Rissanen, et al. Minimum Description Length Principle, 2010, Encyclopedia of Machine Learning.
[27] Naftali Tishby, et al. Multivariate Information Bottleneck, 2001, Neural Computation.
[28] Jacob Deasy, et al. Constraining Variational Inference with Geometric Jensen-Shannon Divergence, 2020, NeurIPS.
[29] Peter Harremoës, et al. Rényi Divergence and Kullback-Leibler Divergence, 2012, IEEE Transactions on Information Theory.
[30] Kai Xu, et al. Telescoping Density-Ratio Estimation, 2020, NeurIPS.
[31] K. Rose. Deterministic annealing for clustering, compression, classification, regression, and related optimization problems, 1998, Proc. IEEE.
[32] Alexander A. Alemi, et al. Fixing a Broken ELBO, 2017, ICML.
[33] Peter Harremoës. Interpretations of Rényi Entropies and Divergences, 2005.
[34] Frank Nielsen, et al. An Information-Geometric Characterization of Chernoff Information, 2013, IEEE Signal Processing Letters.
[35] Alexander A. Alemi, et al. Deep Variational Information Bottleneck, 2017, ICLR.
[36] Imre Csiszár, et al. Information Theory and Statistics: A Tutorial, 2004, Found. Trends Commun. Inf. Theory.
[37] Imre Csiszár. The Method of Types, 1998, IEEE Transactions on Information Theory.
[38] Frank Nielsen, et al. A closed-form expression for the Sharma–Mittal entropy of exponential families, 2011, ArXiv.