The Variational Homoencoder: Learning to learn high capacity generative models from few examples

Hierarchical Bayesian methods can unify many related tasks (e.g. k-shot classification, conditional and unconditional generation) as inference within a single generative model. However, when this generative model is expressed as a powerful neural network such as a PixelCNN, we show that existing learning techniques typically fail to use its latent variables effectively. To address this, we develop a modification of the Variational Autoencoder in which encoded observations are decoded to new elements from the same class. This technique, which we call a Variational Homoencoder (VHE), produces a hierarchical latent variable model that makes better use of its latent variables. We use the VHE framework to learn a hierarchical PixelCNN on the Omniglot dataset, which outperforms all existing models on test-set likelihood and achieves strong performance on one-shot generation and classification tasks. We additionally validate the VHE on natural images from the YouTube Faces database. Finally, we develop extensions of the model that apply to richer dataset structures such as factorial and hierarchical categories.
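
To make the core idea concrete, the sketch below pairs each training element x with a support set D drawn from the same class: D is encoded to a class-level latent c, and x (rather than the encoder's own input) is reconstructed from c, matching the "encoded observations are decoded to new elements from the same class" construction described above. This is a minimal illustration, not the authors' code: the MLP encoder/decoder stands in for the paper's PixelCNN, and the names (VHE, bound, x_dim, class_size) and the exact form of the per-class KL amortisation are assumptions made for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VHE(nn.Module):
    """Minimal VHE sketch: a set encoder q(c|D) and an element decoder p(x|c)."""

    def __init__(self, x_dim=784, c_dim=64, h_dim=256):
        super().__init__()
        # q(c|D): embed each support element, then mean-pool so the
        # posterior depends on the set D rather than any single element.
        self.embed = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, c_dim)
        self.logvar = nn.Linear(h_dim, c_dim)
        # p(x|c): decodes a *different* element of the same class.
        # (The paper uses a PixelCNN here; an MLP keeps the sketch small.)
        self.decode = nn.Sequential(nn.Linear(c_dim, h_dim), nn.ReLU(),
                                    nn.Linear(h_dim, x_dim))

    def bound(self, D, x, class_size):
        # D: (n_support, x_dim) examples from one class;
        # x: (x_dim,) target from the same class, not necessarily in D.
        h = self.embed(D).mean(dim=0)
        mu, logvar = self.mu(h), self.logvar(h)
        c = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
        rec = -F.binary_cross_entropy_with_logits(self.decode(c), x,
                                                  reduction='sum')
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        # Rescaling the KL by 1/|X| shares the cost of the class-level
        # latent across all elements of the class, which encourages the
        # model to keep using c even when the decoder is powerful.
        return rec - kl / class_size

model = VHE()
D = torch.rand(5, 784).round()   # toy binarised support set
x = torch.rand(784).round()      # target element from the same class
loss = -model.bound(D, x, class_size=20)
loss.backward()
```

Decoding a held-out element of the class, rather than reconstructing the encoder's input, is what prevents the class latent from degenerating into a per-example code: c can only help if it captures information shared across the whole class.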
