Variational Hyper-encoding Networks

We propose a framework called HyperVAE for encoding distributions of distributions. When a target distribution is modeled by a VAE, its neural network parameters are sampled from a distribution in the model space, which is itself modeled by a hyper-level VAE. We develop a variational inference framework to implicitly encode these parameter distributions into a low-dimensional Gaussian distribution. Given a target distribution, we predict the posterior distribution of the latent code and then use a matrix-network decoder to generate a posterior distribution over the parameters. In contrast to common hyper-network practice, which generates only scale and bias vectors to modify the target-network parameters, HyperVAE encodes the target parameters in full, thus preserving information about the model for each task in the latent space. We derive the training objective for HyperVAE using the minimum description length (MDL) principle, which keeps the complexity of HyperVAE low. We evaluate HyperVAE on density estimation, outlier detection, and the discovery of novel design classes, demonstrating its efficacy.
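
Below is a minimal PyTorch sketch of the encode-then-decode idea the abstract describes: a hyper-encoder maps a summary of a target task to a low-dimensional Gaussian posterior over a latent code, and a hyper-decoder generates the full parameter vector of a task-specific VAE. All module names, layer sizes, and the flattened-parameter decoder are illustrative assumptions; the paper's actual matrix-network decoder and MDL-based objective are not reproduced here.

# Illustrative sketch only; not the authors' implementation.
import torch
import torch.nn as nn

class HyperVAESketch(nn.Module):
    def __init__(self, task_dim=128, latent_dim=16, target_param_dim=1024):
        super().__init__()
        # Hyper-encoder: maps a task summary to a Gaussian posterior
        # over the low-dimensional latent code u.
        self.enc = nn.Sequential(nn.Linear(task_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        # Hyper-decoder: generates the *full* parameter vector of the
        # target VAE (flattened here; the paper uses a matrix network).
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, target_param_dim),
        )

    def forward(self, task_summary):
        h = self.enc(task_summary)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample the latent code u.
        u = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        theta = self.dec(u)  # parameters for the task-specific VAE
        # KL term against a standard normal prior; under an MDL view this
        # acts as the description-length penalty on the latent code.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return theta, kl

In use, theta would be reshaped into the weights of the target VAE and combined with that VAE's reconstruction loss, so that the overall objective trades off task fit against the cost of encoding the model.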
