Generative Attribute Manipulation Scheme for Flexible Fashion Search

In this work, we aim to investigate the practical task of flexible fashion search with attribute manipulation, where users can retrieve the target fashion items by replacing the unwanted attributes of an available query image with the desired ones (e.g., changing the collar attribute from v-neck to round). Although several pioneer efforts have been dedicated to fulfilling the task, they mainly ignore the potential of generative models in enhancing the visual understanding of target fashion items. To this end, we propose an end-to-end generative attribute manipulation scheme, which consists of a generator and a discriminator. The generator works on producing the prototype image that meets the user's requirement of attribute manipulation over the query image with the regularization of visual-semantic consistency and pixel-wise consistency. Besides, the discriminator aims to jointly fulfill the semantic learning towards correct attribute manipulation and adversarial metric learning for fashion search. Pertaining to the adversarial metric learning, we provide two general paradigms: the pair-based scheme and the triplet-based scheme, where the fake generated prototype images that closely resemble the ground truth images of target items are incorporated as hard negative samples to boost the model performance. Extensive experiments on two real-world datasets verify the effectiveness of our scheme.

[1]  Shuhui Jiang,et al.  Deep Bi-directional Cross-triplet Embedding for Cross-Domain Clothing Retrieval , 2016, ACM Multimedia.

[2]  Yike Guo,et al.  Semantic Image Synthesis via Adversarial Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[3]  Linmei Hu,et al.  Virtually Trying on New Clothing with Arbitrary Poses , 2019, ACM Multimedia.

[4]  Marc'Aurelio Ranzato,et al.  DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.

[5]  Weinan Zhang,et al.  Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances , 2018, SIGIR.

[6]  Yann LeCun,et al.  Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[7]  Jo Yew Tham,et al.  Learning Attribute Representations with Localization for Flexible Fashion Search , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8]  Yue Gao,et al.  Attribute-augmented semantic hierarchy: towards bridging semantic gap and intention gap in image retrieval , 2013, ACM Multimedia.

[9]  Shih-Fu Chang,et al.  Visual Translation Embedding Network for Visual Relation Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Sang-goo Lee,et al.  Fashion Attributes-to-Image Synthesis Using Attention-Based Generative Adversarial Network , 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[11]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Kwang In Kim,et al.  Unsupervised Attention-guided Image to Image Translation , 2018, NeurIPS.

[13]  Larry S. Davis,et al.  VITON: An Image-Based Virtual Try-on Network , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[14]  Shiguang Shan,et al.  AttGAN: Facial Attribute Editing by Only Changing What You Want , 2017, IEEE Transactions on Image Processing.

[15]  Larry S. Davis,et al.  Automatic Spatially-Aware Fashion Concept Discovery , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[16]  Bohyung Han,et al.  Stochastic Class-Based Hard Example Mining for Deep Metric Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Liqiang Nie,et al.  Supervised Hierarchical Cross-Modal Hashing , 2019, SIGIR.

[18]  Shangsong Liang,et al.  Unsupervised Semantic Generative Adversarial Networks for Expert Retrieval , 2019, WWW.

[19]  Xudong Lin,et al.  Deep Adversarial Metric Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Nir Ailon,et al.  Deep Metric Learning Using Triplet Network , 2014, SIMBAD.

[21]  Abhinav Gupta,et al.  Training Region-Based Object Detectors with Online Hard Example Mining , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Georgios Balikas,et al.  On a Topic Model for Sentences , 2016, SIGIR.

[23]  Q. Liu,et al.  FashionGAN: Display your fashion design using Conditional Generative Adversarial Nets , 2018, Comput. Graph. Forum.

[24]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[25]  Nicu Sebe,et al.  Viraliency: Pooling Local Virality , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Dimitris N. Metaxas,et al.  StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[27]  M. de Rijke,et al.  Improving Outfit Recommendation with Co-supervision of Fashion Generation , 2019, WWW.

[28]  Meng Wang,et al.  Multi-View Object Retrieval via Multi-Scale Topic Models , 2016, IEEE Transactions on Image Processing.

[29]  Meng Liu,et al.  Attentive Moment Retrieval in Videos , 2018, SIGIR.

[30]  Qiang Chen,et al.  Cross-Domain Image Retrieval with a Dual Attribute-Aware Ranking Network , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[31]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Qi Tian,et al.  Cross-modal Moment Localization in Videos , 2018, ACM Multimedia.

[33]  Shuicheng Yan,et al.  Deep Search with Attribute-aware Deep Network , 2014, ACM Multimedia.

[34]  Liqing Zhang,et al.  Demand-adaptive Clothing Image Retrieval Using Hybrid Topic Model , 2016, ACM Multimedia.

[35]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[36]  Anton van den Hengel,et al.  Image-Based Recommendations on Styles and Substitutes , 2015, SIGIR.

[37]  Yann LeCun,et al.  Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[38]  Xiaogang Wang,et al.  DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Adriana Kovashka,et al.  WhittleSearch: Image search with relative attribute feedback , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[41]  Peng Zhang,et al.  IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models , 2017, SIGIR.

[42]  Bo Zhao,et al.  Memory-Augmented Attribute Manipulation Networks for Interactive Fashion Search , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Yang Song,et al.  Learning Fine-Grained Image Similarity with Deep Ranking , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Meng Wang,et al.  Coherent Semantic-Visual Indexing for Large-Scale Image Retrieval in the Cloud , 2017, IEEE Transactions on Image Processing.

[45]  Linlin Liu,et al.  Toward AI fashion design: An Attribute-GAN model for clothing match , 2019, Neurocomputing.