Associative Compression Networks for Representation Learning

This paper introduces Associative Compression Networks (ACNs), a new framework for variational autoencoding with neural networks. The system differs from existing variational autoencoders (VAEs) in that the prior distribution used to model each code is conditioned on a similar code from the dataset. In compression terms this equates to sequentially transmitting the dataset using an ordering determined by proximity in latent space. Since the prior need only account for local, rather than global, variations in the latent space, the coding cost is greatly reduced, leading to rich, informative codes. Crucially, the codes remain informative when powerful, autoregressive decoders are used, which we argue is fundamentally difficult with normal VAEs. Experimental results on MNIST, CIFAR-10, ImageNet and CelebA show that ACNs discover high-level latent features such as object class, writing style, pose and facial expression, which can be used to cluster and classify the data, as well as to generate diverse and convincing samples. We conclude that ACNs are a promising new direction for representation learning: one that steps away from IID modelling, and towards learning a structured description of the dataset as a whole.
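
The claim that a locally conditioned prior lowers the coding cost can be illustrated with a small numerical sketch. This is not the paper's implementation: it assumes diagonal Gaussian posteriors, replaces the learned conditional prior with a fixed-width Gaussian centred on each code's nearest neighbour, and uses synthetic clustered codes. The names `gaussian_kl` and `nearest_neighbour_codes` and all numerical settings are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2)), summed over dimensions, in nats."""
    var_q, var_p = sigma_q ** 2, sigma_p ** 2
    per_dim = np.log(sigma_p / sigma_q) + (var_q + (mu_q - mu_p) ** 2) / (2 * var_p) - 0.5
    return np.sum(per_dim, axis=-1)

def nearest_neighbour_codes(codes):
    """For each code, return the closest other code in the set (Euclidean distance)."""
    dists = np.linalg.norm(codes[:, None, :] - codes[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a code may not be its own neighbour
    return codes[np.argmin(dists, axis=1)]

# Synthetic latent codes: 200 points in 8 dimensions, grouped into 5 tight clusters
# (an assumption standing in for the codes a trained encoder would produce).
rng = np.random.default_rng(0)
centres = rng.normal(scale=3.0, size=(5, 8))
codes = centres[rng.integers(0, 5, size=200)] + 0.1 * rng.normal(size=(200, 8))
sigma_q = np.full_like(codes, 0.1)  # assumed posterior standard deviation

# Global prior: one unit Gaussian for every code, as in a standard VAE.
kl_global = gaussian_kl(codes, sigma_q, np.zeros_like(codes), np.ones_like(codes)).mean()

# Local prior: a Gaussian centred on the nearest neighbouring code
# (a simplified stand-in for a prior conditioned on a similar code).
neighbours = nearest_neighbour_codes(codes)
kl_local = gaussian_kl(codes, sigma_q, neighbours, np.full_like(codes, 0.2)).mean()

print(f"mean KL, global prior: {kl_global:.2f} nats")
print(f"mean KL, local prior:  {kl_local:.2f} nats")
```

Under these assumptions the local prior only has to cover the spread within a cluster, so its average KL cost comes out well below that of the global prior, which must span every cluster at once.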
