Understanding Posterior Collapse in Generative Latent Variable Models

Posterior collapse in Variational Autoencoders (VAEs) arises when the variational posterior closely matches the uninformative prior for a subset of latent variables. This paper presents a simple and intuitive explanation of posterior collapse through an analysis of linear VAEs and their direct correspondence with Probabilistic PCA (pPCA). We identify how local maxima can emerge in the marginal log-likelihood of pPCA, and show that these induce similar local maxima in the evidence lower bound (ELBO). We show that training a linear VAE with variational inference recovers a uniquely identifiable global maximum corresponding to the principal component directions. We provide empirical evidence that the presence of these local maxima causes posterior collapse in non-linear VAEs as well. Our findings help explain a wide range of heuristic approaches in the literature that attempt to diminish the effect of the KL term in the ELBO to alleviate posterior collapse.
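
Since the analysis centers on the ELBO of a linear VAE fit to pPCA-style data, a concrete sketch may help fix the objects involved: the ELBO is E_{q(z|x)}[log p(x|z)] - KL(q(z|x) || p(z)), and posterior collapse corresponds to latent dimensions whose KL contribution is driven to zero. The code below is a minimal hypothetical illustration, not the authors' implementation; the dimensions, learning rate, and the KL weight beta are assumptions, with beta < 1 mimicking the KL-diminishing heuristics the abstract refers to.

```python
# Minimal sketch (not the paper's code) of a linear VAE trained on
# synthetic pPCA data. All dimensions and hyperparameters are illustrative.
import math
import torch

torch.manual_seed(0)
n, d, k = 2000, 10, 3  # samples, observed dim, latent dim (assumed values)

# Synthetic pPCA data: x = W_true z + eps, with z ~ N(0, I), eps ~ N(0, 0.01 I)
W_true = torch.randn(d, k)
x = torch.randn(n, k) @ W_true.T + 0.1 * torch.randn(n, d)

# Linear decoder p(x|z) = N(W z + mu, sigma^2 I);
# linear encoder q(z|x) = N(V x, diag(exp(log_var))), with the diagonal
# covariance shared across data points (one common choice for linear VAEs).
W = torch.randn(d, k, requires_grad=True)
mu = torch.zeros(d, requires_grad=True)
V = torch.randn(k, d, requires_grad=True)
log_var = torch.zeros(k, requires_grad=True)
log_sigma2 = torch.zeros((), requires_grad=True)  # observation noise

opt = torch.optim.Adam([W, mu, V, log_var, log_sigma2], lr=1e-2)
beta = 1.0  # beta = 1 is the true ELBO; beta < 1 down-weights the KL term

for step in range(2000):
    z_mean = x @ V.T
    z = z_mean + torch.exp(0.5 * log_var) * torch.randn(n, k)  # reparameterization
    x_mean = z @ W.T + mu
    sigma2 = torch.exp(log_sigma2)
    # Gaussian log-likelihood of x under the decoder, per sample
    recon = -0.5 * (((x - x_mean) ** 2).sum(1) / sigma2
                    + d * torch.log(2 * math.pi * sigma2))
    # KL(q(z|x) || N(0, I)) in closed form for diagonal Gaussians
    kl = 0.5 * (z_mean ** 2 + torch.exp(log_var) - log_var - 1).sum(1)
    loss = -(recon - beta * kl).mean()  # negative (beta-weighted) ELBO
    opt.zero_grad(); loss.backward(); opt.step()

# With beta = 1, the columns of W should span the top principal subspace of
# the data; a collapsed latent dimension i shows up as q(z_i|x) ~= N(0, 1),
# i.e. its per-dimension KL term is approximately zero.
```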
