Recurrent Variational Autoencoders for Learning Nonlinear Generative Models in the Presence of Outliers

This paper explores two useful modifications of the recent variational autoencoder (VAE), a popular deep generative modeling framework that dresses traditional autoencoders with probabilistic attire. The first involves a specially tailored form of conditioning that allows us to simplify the VAE decoder structure while simultaneously introducing robustness to outliers. In a related vein, a second, complementary alteration is proposed to further build invariance to contaminated or dirty samples via a data augmentation process that amounts to recycling. In brief, to the extent that the VAE is legitimately a representative generative model, each output from the decoder should closely resemble an authentic sample, which can then be resubmitted as a novel input ad infinitum. Moreover, this can be accomplished via special recurrent connections without the need to train any additional parameters. We evaluate these proposals on multiple practical outlier-removal and generative modeling tasks involving nonlinear low-dimensional manifolds, demonstrating considerable improvements over existing algorithms.
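To make the recycling mechanism concrete, below is a minimal PyTorch sketch of feeding decoder reconstructions back through the same encoder/decoder as fresh inputs. It assumes a standard Gaussian-latent VAE on data scaled to [0, 1]; all names (`VAE`, `recycled_loss`, `recycle_steps`) and the gradient-stopping choice are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of the "recycling" idea: reconstructions are resubmitted as
# inputs through the SAME encoder/decoder (recurrent weight sharing, so no
# new parameters are trained). Illustrative only; not the paper's exact model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=20, h_dim=400):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(
            nn.Linear(z_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, x_dim), nn.Sigmoid())  # outputs in [0, 1]

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(z), mu, logvar

def recycled_loss(model, x, recycle_steps=2):
    """Accumulate the usual VAE objective over the original input plus
    `recycle_steps` recycled passes, reusing the same weights each time."""
    total, x_in = 0.0, x
    for _ in range(recycle_steps + 1):
        x_hat, mu, logvar = model(x_in)
        recon = F.binary_cross_entropy(x_hat, x_in, reduction='sum')
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        total = total + recon + kl
        # Treat the reconstruction as a new (presumably cleaner) sample.
        # Gradients are stopped across passes here; the paper's recurrent
        # connections may handle this differently.
        x_in = x_hat.detach()
    return total
```

Because the recycled passes reuse the same weights, the recurrence adds effective depth and synthetic training samples without enlarging the parameter count, which is the property the abstract highlights.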
