Capturing Label Characteristics in VAEs

We present a principled approach to incorporating labels in variational autoencoders (VAEs) that captures the rich characteristic information associated with those labels. Prior work has typically conflated labels with their characteristics by learning latent variables that directly correspond to label values; we argue this is contrary to the intended effect of supervision in VAEs: capturing rich label characteristics with the latents. For example, we may want to capture the characteristics of a face that make it look young, rather than just the age of the person. To this end, we develop the characteristic capturing VAE (CCVAE), a novel VAE model and concomitant variational objective that captures label characteristics explicitly in the latent space, eschewing direct correspondences between label values and latents. Through judicious structuring of the mappings between such characteristic latents and labels, we show that the CCVAE can learn meaningful representations of the characteristics of interest across a variety of supervision schemes. In particular, we show that the CCVAE admits more effective and more general interventions: smooth traversals within the characteristics of a given label, diverse conditional generation, and transfer of characteristics across datapoints.
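The core design choice described above is to partition the latent space into characteristic latents, couple each to its label through a classifier and a label-conditioned prior, and leave the remaining latents free, rather than pinning a latent to the label value itself. Below is a minimal PyTorch sketch of that structure; all names, network sizes, the binary-label assumption, and the simplified one-weight-per-label classifier are illustrative assumptions of ours, not the authors' released implementation.

```python
# A minimal sketch of the CCVAE structure, assuming binary labels, one
# characteristic latent per label, fully supervised data, and small MLPs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CCVAESketch(nn.Module):
    def __init__(self, x_dim=784, z_dim=16, n_labels=4):
        super().__init__()
        self.n_c = n_labels  # first n_labels dims are characteristic latents
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * z_dim))  # q(z | x)
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, x_dim))      # p(x | z)
        # "Diagonal" classifier q(y_i | z_i): label i sees only latent i,
        # so latent i must absorb the characteristics behind label i.
        self.cls_w = nn.Parameter(torch.randn(n_labels))
        self.cls_b = nn.Parameter(torch.zeros(n_labels))
        # Conditional prior p(z_c | y) over the characteristic latents.
        self.prior = nn.Linear(n_labels, 2 * n_labels)

    def loss(self, x, y):
        """Simplified negative ELBO for a batch; y is a float 0/1 tensor."""
        mu, lv = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * lv).exp()  # reparameterise
        logits = self.cls_w * z[:, :self.n_c] + self.cls_b
        rec = F.binary_cross_entropy_with_logits(self.dec(z), x,
                                                 reduction='sum')
        # KL of characteristic latents against the label-conditioned prior.
        pm, plv = self.prior(y).chunk(2, dim=-1)
        m_c, lv_c = mu[:, :self.n_c], lv[:, :self.n_c]
        kl_c = 0.5 * ((lv_c.exp() + (m_c - pm) ** 2) / plv.exp()
                      + plv - lv_c - 1).sum()
        # Remaining latents keep a standard N(0, I) prior.
        m_s, lv_s = mu[:, self.n_c:], lv[:, self.n_c:]
        kl_s = 0.5 * (lv_s.exp() + m_s ** 2 - lv_s - 1).sum()
        cls = F.binary_cross_entropy_with_logits(logits, y, reduction='sum')
        return rec + kl_c + kl_s + cls
```

Because label i is read off latent i alone while reconstruction flows through all latents, each characteristic latent is pushed to encode the variation behind its label rather than the bare label value; this structural property is what supports the smooth traversals, conditional generation, and characteristic transfer claimed above.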
