LatentVis: Investigating and Comparing Variational Auto-Encoders via Their Latent Space

As both the result of compression and the source of reconstruction, the latent space of a Variational Auto-Encoder (VAE) captures the essence of the training data and hence plays a fundamental role in data understanding and analysis. To reveal which data features/semantics are encoded in the latent space and how they relate to one another, this paper proposes LatentVis, a visual analytics system for interactively studying the latent space in order to better understand and diagnose image-based VAEs. Specifically, we train a supervised linear model to relate the machine-learned latents to human-understandable semantics. With this model, each important data feature is expressed along a unique direction in the latent space (i.e., its semantic direction). Comparing the semantic directions of different features lets us compare how similarly those features are encoded in the latent space, and thus better understand the encoding process of the corresponding VAE. Moreover, LatentVis allows us to examine and compare latent spaces across training stages or across different VAE models, which can provide useful insight for model diagnosis.
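The idea of a semantic direction can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes an ordinary least-squares fit as the supervised linear model, uses synthetic latent vectors in place of a trained VAE encoder, and the attribute names (`smiling`, `mouth_open`, `glasses`) and helper functions are hypothetical.

```python
import numpy as np

def semantic_direction(latents, labels):
    """Fit labels ~ latents @ w + b by least squares and return w as a unit vector.

    Assumption: an ordinary least-squares fit stands in for the supervised
    linear model; the source does not specify the exact model used.
    """
    # Append a column of ones so the fit includes a bias term b.
    X = np.hstack([latents, np.ones((latents.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X, labels.astype(float), rcond=None)
    w = coef[:-1]  # drop the bias; only the direction matters
    return w / np.linalg.norm(w)

def cosine_similarity(u, v):
    """Cosine of the angle between two direction vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Synthetic stand-in for encoder outputs: 2000 samples, 16 latent dimensions.
rng = np.random.default_rng(0)
latents = rng.normal(size=(2000, 16))

# Hypothetical binary attributes. "smiling" and "mouth_open" share latent
# dimension 0, while "glasses" depends on an unrelated dimension.
smiling = (latents[:, 0] > 0).astype(int)
mouth_open = (latents[:, 0] + 0.5 * latents[:, 1] > 0).astype(int)
glasses = (latents[:, 5] > 0).astype(int)

d_smiling = semantic_direction(latents, smiling)
d_mouth = semantic_direction(latents, mouth_open)
d_glasses = semantic_direction(latents, glasses)

# Features encoded along similar directions score close to 1;
# features encoded along unrelated directions score close to 0.
sim_related = cosine_similarity(d_smiling, d_mouth)
sim_unrelated = cosine_similarity(d_smiling, d_glasses)
print(f"smiling vs mouth_open: {sim_related:.2f}")
print(f"smiling vs glasses:    {sim_unrelated:.2f}")
```

In this sketch, the cosine similarity between two semantic directions plays the role of the feature-similarity comparison described above. Repeating the fit on latents from a different training checkpoint or a different model would give directions that can be compared in the same way.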
