Adversarially Approximated Autoencoder for Image Generation and Manipulation

Regularized autoencoders learn the latent codes, a structure with the regularization under the distribution, which enables them the capability to infer the latent codes given observations and generate new samples given the codes. However, they are sometimes ambiguous as they tend to produce reconstructions that are not necessarily a faithful reproduction of the inputs. The main reason is to enforce the learned latent code distribution to match a prior distribution while the true distribution remains unknown. To improve the reconstruction quality and learn the latent space a manifold structure, this paper presents a novel approach using the adversarially approximated autoencoder (AAAE) to investigate the latent codes with adversarial approximation. Instead of regularizing the latent codes by penalizing on the distance between the distributions of the model and the target, AAAE learns the autoencoder flexibly and approximates the latent space with a simpler generator. The ratio is estimated using a generative adversarial network to enforce the similarity of the distributions. In addition, the image space is regularized with an additional adversarial regularizer. The proposed approach unifies two deep generative models for both latent space inference and diverse generation. The learning scheme is realized without regularization on the latent codes, which also encourages faithful reconstruction. Extensive validation experiments on four real-world datasets demonstrate the superior performance of AAAE. In comparison to the state-of-the-art approaches, AAAE generates samples with better quality and shares the properties of a regularized autoencoder with a nice latent manifold structure.

[1]  Yoshua Bengio,et al.  Mode Regularized Generative Adversarial Networks , 2016, ICLR.

[2]  Yang Song,et al.  Age Progression/Regression by Conditional Adversarial Autoencoder , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[4]  Ole Winther,et al.  Autoencoding beyond pixels using a learned similarity metric , 2015, ICML.

[5]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[6]  Dustin Tran,et al.  Hierarchical Implicit Models and Likelihood-Free Variational Inference , 2017, NIPS.

[7]  Robert Pless,et al.  Deep Feature Interpolation for Image Content Changes , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Shakir Mohamed,et al.  Variational Inference with Normalizing Flows , 2015, ICML.

[9]  Brendan J. Frey,et al.  PixelGAN Autoencoders , 2017, NIPS.

[10]  Ling Shao,et al.  Video Salient Object Detection via Fully Convolutional Networks , 2017, IEEE Transactions on Image Processing.

[11]  Pascal Vincent,et al.  Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..

[12]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Zhe Gan,et al.  Adversarial Symmetric Variational Autoencoder , 2017, NIPS.

[14]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[15]  Trevor Darrell,et al.  Adversarial Discriminative Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Lawrence Carin,et al.  ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching , 2017, NIPS.

[17]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[18]  Wenguan Wang,et al.  Deep Visual Attention Prediction , 2017, IEEE Transactions on Image Processing.

[19]  Eric P. Xing,et al.  On Unifying Deep Generative Models , 2017, ICLR.

[20]  Max Welling,et al.  Improved Variational Inference with Inverse Autoregressive Flow , 2016, NIPS 2016.

[21]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[22]  Alexei A. Efros,et al.  Toward Multimodal Image-to-Image Translation , 2017, NIPS.

[23]  Zhanyi Hu,et al.  Learning Depth From Single Images With Deep Neural Network Embedding Focal Length , 2018, IEEE Transactions on Image Processing.

[24]  Alexander M. Rush,et al.  Adversarially Regularized Autoencoders for Generating Discrete Structures , 2017, ArXiv.

[25]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Navdeep Jaitly,et al.  Adversarial Autoencoders , 2015, ArXiv.

[27]  Léon Bottou,et al.  Wasserstein Generative Adversarial Networks , 2017, ICML.

[28]  Xiaoming Liu,et al.  Disentangled Representation Learning GAN for Pose-Invariant Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[30]  Charles A. Sutton,et al.  VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning , 2017, NIPS.

[31]  David Berthelot,et al.  BEGAN: Boundary Equilibrium Generative Adversarial Networks , 2017, ArXiv.

[32]  Guanghui Wang,et al.  MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection , 2018, 2018 24th International Conference on Pattern Recognition (ICPR).

[33]  Meng Wang,et al.  Deep Aging Face Verification With Large Gaps , 2016, IEEE Transactions on Multimedia.

[34]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[35]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[36]  Alexander M. Rush,et al.  Adversarially Regularized Autoencoders , 2017, ICML.

[37]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[38]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[39]  Sebastian Nowozin,et al.  Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks , 2017, ICML.

[40]  Nikolaos Doulamis,et al.  Deep Learning for Computer Vision: A Brief Review , 2018, Comput. Intell. Neurosci..

[41]  Qi Tian,et al.  Cross-Modal Retrieval Using Multiordered Discriminative Structured Subspace Learning , 2017, IEEE Transactions on Multimedia.

[42]  Aaron C. Courville,et al.  Adversarially Learned Inference , 2016, ICLR.

[43]  Ferenc Huszár,et al.  Variational Inference using Implicit Distributions , 2017, ArXiv.

[44]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Haibin Ling,et al.  A Deep Network Solution for Attention and Aesthetics Aware Photo Cropping , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[47]  Andrew Zisserman,et al.  Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[48]  Shaowei Liu,et al.  General Knowledge Embedded Image Representation Learning , 2018, IEEE Transactions on Multimedia.

[49]  Trevor Darrell,et al.  Adversarial Feature Learning , 2016, ICLR.

[50]  David Vázquez,et al.  PixelVAE: A Latent Variable Model for Natural Images , 2016, ICLR.

[51]  Lei He,et al.  Spindle-Net: CNNs for Monocular Depth Inference with Dilation Kernel Method , 2018, 2018 24th International Conference on Pattern Recognition (ICPR).

[52]  Yann LeCun,et al.  Disentangling factors of variation in deep representation using adversarial training , 2016, NIPS.

[53]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[54]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[55]  Alexandre Lacoste,et al.  Neural Autoregressive Flows , 2018, ICML.

[56]  Dimitris N. Metaxas,et al.  StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[57]  Bernhard Schölkopf,et al.  Wasserstein Auto-Encoders , 2017, ICLR.

[58]  Prafulla Dhariwal,et al.  Glow: Generative Flow with Invertible 1x1 Convolutions , 2018, NeurIPS.

[59]  Alexei A. Efros,et al.  Generative Visual Manipulation on the Natural Image Manifold , 2016, ECCV.

[60]  Ming-Ai Li,et al.  A novel feature extraction method for scene recognition based on Centered Convolutional Restricted Boltzmann Machines , 2015, Neurocomputing.

[61]  Pieter Abbeel,et al.  InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.

[62]  Xi Chen,et al.  PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications , 2017, ICLR.

[63]  Guoping Qiu,et al.  Learning Based Image Transformation Using Convolutional Neural Networks , 2018, IEEE Access.