Towards Automatic Image Editing: Learning to See another You

Learning the distribution of images in order to generate new samples is a challenging task due to the high dimensionality of the data and the highly non-linear relations that are involved. Nevertheless, some promising results have been reported in the literature recently,building on deep network architectures. In this work, we zoom in on a specific type of image generation: given an image and knowing the category of objects it belongs to (e.g. faces), our goal is to generate a similar and plausible image, but with some altered attributes. This is particularly challenging, as the model needs to learn to disentangle the effect of each attribute and to apply a desired attribute change to a given input image, while keeping the other attributes and overall object appearance intact. To this end, we learn a convolutional network, where the desired attribute information is encoded then merged with the encoded image at feature map level. We show promising results, both qualitatively as well as quantitatively, in the context of a retrieval experiment, on two face datasets (MultiPie and CAS-PEAL-R1).

[1]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[2]  Xu Jia,et al.  Swap Retrieval: Retrieving Images of Cats When the Query Shows a Dog , 2015, ICMR.

[3]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[4]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[5]  Jan Kautz,et al.  Modeling object appearance using Context-Conditioned Component Analysis , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Xiaogang Wang,et al.  Multi-View Perceptron: a Deep Model for Learning Face Identity and View Representations , 2014, NIPS.

[7]  Jan Kautz,et al.  Visio-lization: generating novel facial images , 2009, ACM Trans. Graph..

[8]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[9]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[10]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[11]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Tijmen Tieleman,et al.  Optimizing Neural Networks that Generate Iimages , 2014 .

[13]  Seunghoon Hong,et al.  Learning Deconvolution Network for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[14]  Scott E. Reed,et al.  Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis , 2015, NIPS.

[15]  Jon Gauthier Conditional generative adversarial nets for convolutional face generation , 2015 .

[16]  Thomas Brox,et al.  Learning to generate chairs with convolutional neural networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Wen Gao,et al.  The CAS-PEAL Large-Scale Chinese Face Database and Baseline Evaluations , 2008, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.

[18]  Richard S. Zemel,et al.  Generative Moment Matching Networks , 2015, ICML.

[19]  Andrew Zisserman,et al.  Deep Face Recognition , 2015, BMVC.

[20]  Sébastien Marcel,et al.  A Scalable Formulation of Probabilistic Linear Discriminant Analysis: Applied to Face Recognition , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[22]  Thomas Brox,et al.  Single-view to Multi-view: Reconstructing Unseen Views with a Convolutional Network , 2015, ArXiv.

[23]  Joshua B. Tenenbaum,et al.  Deep Convolutional Inverse Graphics Network , 2015, NIPS.

[24]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[25]  Geoffrey E. Hinton,et al.  Transforming Auto-Encoders , 2011, ICANN.

[26]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[27]  Du-Sik Park,et al.  Rotating your face using multi-task deep neural network , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[29]  Xiaogang Wang,et al.  Deep Learning Identity-Preserving Face Space , 2013, 2013 IEEE International Conference on Computer Vision.

[30]  Takeo Kanade,et al.  Multi-PIE , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[31]  Matt J. Kusner,et al.  Deep Manifold Traversal: Changing Labels with Convolutional Features , 2015, ArXiv.