Attributes Consistent Faces Generation Under Arbitrary Poses

Generating visual images automatically given semantic attributes description helps building friendly human-machine interface as criminal suspects depicting and customer-oriented product design. However, this is a hard task since there exist great semantic and structure gaps between attributes description and visual images, which further cause great inference uncertainty in transforming from attribute description to vivid visual images. We aim to reduce the posterior distribute complexity given attributes (P(I|A)) by exploring face structure knowledge and imposing semantic consistency constraint in an end-to-end learning fashion. Our contributions are three-fold: (1) we address the semantic and structure consistent problem in attribute-conditioned image generation; (2) the proposed method can generate attribute consistent face images with high quality in both detailed texture and also clear structure; (3) we provide an interface for generating attribute consistent images with diverse poses.

[1]  Andrea Vedaldi,et al.  Understanding deep image representations by inverting them , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[3]  Larry S. Davis,et al.  Image ranking and retrieval based on multi-attribute queries , 2011, CVPR 2011.

[4]  Matti Pietikäinen,et al.  Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  Bernt Schiele,et al.  Learning What and Where to Draw , 2016, NIPS.

[6]  Alexandre Alahi,et al.  From Bits to Images: Inversion of Local Binary Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Xin Jin,et al.  Face alignment by robust discriminative Hough voting , 2016, Pattern Recognit..

[8]  David A. Forsyth,et al.  Describing objects by their attributes , 2009, CVPR.

[9]  Geoffrey E. Hinton,et al.  Zero-shot Learning with Semantic Output Codes , 2009, NIPS.

[10]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Shuiwang Ji,et al.  Multi-Stage Variational Auto-Encoders for Coarse-to-Fine Image Generation , 2017, SDM.

[12]  Honglak Lee,et al.  Attribute2Image: Conditional Image Generation from Visual Attributes , 2015, ECCV.

[13]  Thomas Brox,et al.  Learning to generate chairs with convolutional neural networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  LinLin Shen,et al.  Deep Feature Consistent Variational Autoencoder , 2016, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[16]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[17]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[19]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[20]  Gang Hua,et al.  CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Dimitris N. Metaxas,et al.  StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[22]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[23]  Honglak Lee,et al.  Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.

[24]  Leonidas J. Guibas,et al.  Human action recognition by learning bases of action attributes and parts , 2011, 2011 International Conference on Computer Vision.

[25]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[26]  Christoph H. Lampert,et al.  Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27]  Chun Chen,et al.  Face sketch-to-photo synthesis from simple line drawing , 2012, Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference.

[28]  Thomas Brox,et al.  Generating Images with Perceptual Similarity Metrics based on Deep Networks , 2016, NIPS.

[29]  Gang Hua,et al.  Labeled Faces in the Wild: A Survey , 2016 .

[30]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[31]  Antonio Torralba,et al.  HOGgles: Visualizing Object Detection Features , 2013, 2013 IEEE International Conference on Computer Vision.

[32]  Shree K. Nayar,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence Describable Visual Attributes for Face Verification and Image Search , 2022 .

[33]  James Hays,et al.  SUN attribute database: Discovering, annotating, and recognizing scene attributes , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Martial Hebert,et al.  An Uncertain Future: Forecasting from Static Images Using Variational Autoencoders , 2016, ECCV.

[35]  Qi Guo,et al.  Attribute-controlled face photo synthesis from simple line drawing , 2017, 2017 IEEE International Conference on Image Processing (ICIP).