Density Estimation and Incremental Learning of Latent Vector for Generative Autoencoders

In this paper, we treat the image generation task using the autoencoder, a representative latent model. Unlike many studies regularizing the latent variable's distribution by assuming a manually specified prior, we approach the image generation task using an autoencoder by directly estimating the latent distribution. To do this, we introduce 'latent density estimator' which captures latent distribution explicitly and propose its structure. In addition, we propose an incremental learning strategy of latent variables so that the autoencoder learns important features of data by using the structural characteristics of under-complete autoencoder without an explicit regularization term in the objective function. Through experiments, we show the effectiveness of the proposed latent density estimator and the incremental learning strategy of latent variables. We also show that our generative model generates images with improved visual quality compared to previous generative models based on autoencoders.

[1]  Shakir Mohamed,et al.  Variational Inference with Normalizing Flows , 2015, ICML.

[2]  Max Welling,et al.  Improved Variational Inference with Inverse Autoregressive Flow , 2016, NIPS 2016.

[3]  Alexandre Lacoste,et al.  Neural Autoregressive Flows , 2018, ICML.

[4]  Hugo Larochelle,et al.  MADE: Masked Autoencoder for Distribution Estimation , 2015, ICML.

[5]  Heiga Zen,et al.  WaveNet: A Generative Model for Raw Audio , 2016, SSW.

[6]  Francesco Visin,et al.  A guide to convolution arithmetic for deep learning , 2016, ArXiv.

[7]  Hugo Larochelle,et al.  A Deep and Tractable Density Estimator , 2013, ICML.

[8]  Bernhard Schölkopf,et al.  Wasserstein Auto-Encoders , 2017, ICLR.

[9]  Padhraic Smyth,et al.  Stick-Breaking Variational Autoencoders , 2016, ICLR.

[10]  LinLin Shen,et al.  Deep Feature Consistent Variational Autoencoder , 2016, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[11]  Hugo Larochelle,et al.  RNADE: The real-valued neural autoregressive density-estimator , 2013, NIPS.

[12]  Kristen Grauman,et al.  Semantic Jitter: Dense Supervision for Visual Comparisons via Synthetic Images , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Jitendra Malik,et al.  A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[14]  Yoshua Bengio,et al.  A Generative Process for sampling Contractive Auto-Encoders , 2012, ICML 2012.

[15]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[16]  M. Kramer Nonlinear principal component analysis using autoassociative neural networks , 1991 .

[17]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[18]  Iain Murray,et al.  Masked Autoregressive Flow for Density Estimation , 2017, NIPS.

[19]  Simon Haykin,et al.  GradientBased Learning Applied to Document Recognition , 2001 .

[20]  Hugo Larochelle,et al.  Neural Autoregressive Distribution Estimation , 2016, J. Mach. Learn. Res..

[21]  Lancelot F. James,et al.  Gibbs Sampling Methods for Stick-Breaking Priors , 2001 .

[22]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[23]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[24]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[26]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[27]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[28]  Richard S. Zemel,et al.  Generative Moment Matching Networks , 2015, ICML.

[29]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[31]  Yoshua Bengio,et al.  Better Mixing via Deep Representations , 2012, ICML.

[32]  Yoshua Bengio,et al.  Deep Generative Stochastic Networks Trainable by Backprop , 2013, ICML.

[33]  Samy Bengio,et al.  Density estimation using Real NVP , 2016, ICLR.

[34]  Li Fei-Fei,et al.  Tackling Over-pruning in Variational Autoencoders , 2017, ArXiv.

[35]  Wen-Hsiao Peng,et al.  Learning Priors for Adversarial Autoencoders , 2018, 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).

[36]  C. Bishop Mixture density networks , 1994 .

[37]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[38]  Max Welling,et al.  VAE with a VampPrior , 2017, AISTATS.

[39]  Kristen Grauman,et al.  Fine-Grained Visual Comparisons with Local Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[41]  Navdeep Jaitly,et al.  Adversarial Autoencoders , 2015, ArXiv.