Tackling Over-pruning in Variational Autoencoders

Variational autoencoders (VAE) are directed generative models that learn factorial latent variables. As noted by Burda et al. (2015), these models exhibit the problem of factor over-pruning where a significant number of stochastic factors fail to learn anything and become inactive. This can limit their modeling power and their ability to learn diverse and meaningful latent representations. In this paper, we evaluate several methods to address this problem and propose a more effective model-based approach called the epitomic variational autoencoder (eVAE). The so-called epitomes of this model are groups of mutually exclusive latent factors that compete to explain the data. This approach helps prevent inactive units since each group is pressured to explain the data. We compare the approaches with qualitative and quantitative results on MNIST and TFD datasets. Our results show that eVAE makes efficient use of model capacity and generalizes better than VAE.

[1]  Yann LeCun,et al.  Structured sparse coding via lateral inhibition , 2011, NIPS.

[2]  Julien Mairal,et al.  Proximal Methods for Hierarchical Sparse Coding , 2010, J. Mach. Learn. Res..

[3]  Alex Graves,et al.  DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.

[4]  Yoshua Bengio,et al.  Better Mixing via Deep Representations , 2012, ICML.

[5]  Geoffrey E. Hinton,et al.  Attend, Infer, Repeat: Fast Scene Understanding with Generative Models , 2016, NIPS.

[6]  Ole Winther,et al.  How to Train Deep Variational Autoencoders and Probabilistic Ladder Networks , 2016, ICML 2016.

[7]  Max Welling,et al.  Improved Variational Inference with Inverse Autoregressive Flow , 2016, NIPS 2016.

[8]  Yoshua Bengio,et al.  A Generative Process for sampling Contractive Auto-Encoders , 2012, ICML 2012.

[9]  Jiajun Wu,et al.  Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks , 2016, NIPS.

[10]  Yoshua Bengio,et al.  Deep Generative Stochastic Networks Trainable by Backprop , 2013, ICML.

[11]  Brendan J. Frey,et al.  Epitomic analysis of appearance and shape , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[12]  Honglak Lee,et al.  Attribute2Image: Conditional Image Generation from Visual Attributes , 2015, ECCV.

[13]  Ole Winther,et al.  Ladder Variational Autoencoders , 2016, NIPS.

[14]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[15]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[16]  Richard S. Zemel,et al.  Generative Moment Matching Networks , 2015, ICML.

[17]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[18]  Max Welling,et al.  Markov Chain Monte Carlo and Variational Inference: Bridging the Gap , 2014, ICML.

[19]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[20]  Samy Bengio,et al.  Generating Sentences from a Continuous Space , 2015, CoNLL.

[21]  Shakir Mohamed,et al.  Variational Inference with Normalizing Flows , 2015, ICML.

[22]  Joshua B. Tenenbaum,et al.  Deep Convolutional Inverse Graphics Network , 2015, NIPS.

[23]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[24]  Ruslan Salakhutdinov,et al.  Importance Weighted Autoencoders , 2015, ICLR.