TAC-GAN - Text Conditioned Auxiliary Classifier Generative Adversarial Network

In this work, we present the Text Conditioned Auxiliary Classifier Generative Adversarial Network, (TAC-GAN) a text to image Generative Adversarial Network (GAN) for synthesizing images from their text descriptions. Former approaches have tried to condition the generative process on the textual data; but allying it to the usage of class information, known to diversify the generated samples and improve their structural coherence, has not been explored. We trained the presented TAC-GAN model on the Oxford-102 dataset of flowers, and evaluated the discriminability of the generated images with Inception-Score, as well as their diversity using the Multi-Scale Structural Similarity Index (MS-SSIM). Our approach outperforms the state-of-the-art models, i.e., its inception score is 3.45, corresponding to a relative increase of 7.8% compared to the recently introduced StackGan. A comparison of the mean MS-SSIM scores of the training and generated samples per class shows that our approach is able to generate highly diverse images with an average MS-SSIM of 0.14 over all generated classes.

[1]  Minh N. Do,et al.  Semantic Image Inpainting with Perceptual and Contextual Losses , 2016, ArXiv.

[2]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[4]  Bernt Schiele,et al.  Learning What and Where to Draw , 2016, NIPS.

[5]  Alexei A. Efros,et al.  Scene completion using millions of photographs , 2007, SIGGRAPH 2007.

[6]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[7]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[8]  Minh N. Do,et al.  Semantic Image Inpainting with Deep Generative Models , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[10]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[11]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[12]  Dimitris N. Metaxas,et al.  StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[14]  Andrew Zisserman,et al.  Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[15]  Matthias Bethge,et al.  A note on the evaluation of generative models , 2015, ICLR.

[16]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Alex Graves,et al.  DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.

[18]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[19]  Abhinav Gupta,et al.  Generative Image Modeling Using Style and Structure Adversarial Networks , 2016, ECCV.

[20]  Carl Doersch,et al.  Tutorial on Variational Autoencoders , 2016, ArXiv.

[21]  Zhou Wang,et al.  Group MAD Competition? A New Methodology to Compare Objective Image Quality Models , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Tomas Pfister,et al.  Learning from Simulated and Unsupervised Images through Adversarial Training , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[24]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[25]  Ruslan Salakhutdinov,et al.  Generating Images from Captions with Attention , 2015, ICLR.

[26]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[27]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.