POG: Personalized Outfit Generation for Fashion Recommendation at Alibaba iFashion

Increasing demand for fashion recommendation raises a lot of challenges for online shopping platforms and fashion communities. In particular, there exist two requirements for fashion outfit recommendation: the Compatibility of the generated fashion outfits, and the Personalization in the recommendation process. In this paper, we demonstrate these two requirements can be satisfied via building a bridge between outfit generation and recommendation. Through large data analysis, we observe that people have similar tastes in individual items and outfits. Therefore, we propose a Personalized Outfit Generation (POG) model, which connects user preferences regarding individual items and outfits with Transformer architecture. Extensive offline and online experiments provide strong quantitative evidence that our method outperforms alternative methods regarding both compatibility and personalization metrics. Furthermore, we deploy POG on a platform named Dida in Alibaba to generate personalized outfits for the users of the online application iFashion. This work represents a first step towards an industrial-scale fashion outfit generation and recommendation solution, which goes beyond generating outfits based on explicit queries, or merely recommending from existing outfit pools. As part of this work, we release a large-scale dataset consisting of 1.01 million outfits with rich context information, and 0.28 billion user click actions from 3.57 million users. To the best of our knowledge, this dataset is the largest, publicly available, fashion related dataset, and the first to provide user behaviors relating to both outfits and fashion items.

[1]  Changsheng Xu,et al.  Hi, magic closet, tell me what to wear! , 2012, ACM Multimedia.

[2]  Qiang Chen,et al.  Cross-Domain Image Retrieval with a Dual Attribute-Aware Ranking Network , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[3]  Anton van den Hengel,et al.  Image-Based Recommendations on Styles and Substitutes , 2015, SIGIR.

[4]  Chen Fang,et al.  Visually-Aware Fashion Recommendation and Design with Generative Image Models , 2017, 2017 IEEE International Conference on Data Mining (ICDM).

[5]  Jun Ma,et al.  NeuroStylist: Neural Compatibility Modeling for Clothing Matching , 2017, ACM Multimedia.

[6]  Bowen Zhou,et al.  A Structured Self-attentive Sentence Embedding , 2017, ICLR.

[7]  David A. Forsyth,et al.  Learning Type-Aware Embeddings for Fashion Compatibility , 2018, ECCV.

[8]  Alec Radford,et al.  Improving Language Understanding by Generative Pre-Training , 2018 .

[9]  Larry S. Davis,et al.  Collaborative Fashion Recommendation: A Functional Tensor Factorization Approach , 2015, ACM Multimedia.

[10]  Shuicheng Yan,et al.  Clothes Co-Parsing Via Joint Image Segmentation and Labeling With Application to Clothing Retrieval , 2016, IEEE Transactions on Multimedia.

[11]  Ryosuke Goto,et al.  Outfit Generation and Style Extraction via Bidirectional LSTM and Autoencoder , 2018, ArXiv.

[12]  Julian J. McAuley,et al.  Self-Attentive Sequential Recommendation , 2018, 2018 IEEE International Conference on Data Mining (ICDM).

[13]  Ivona Tautkute,et al.  DeepStyle: Multimodal Search Engine for Fashion and Interior Design , 2018, IEEE Access.

[14]  Tamara L. Berg,et al.  Paper Doll Parsing: Retrieving Similar Styles to Parse Clothing Items , 2013, 2013 IEEE International Conference on Computer Vision.

[15]  Serge J. Belongie,et al.  Learning Visual Clothing Style with Heterogeneous Dyadic Co-Occurrences , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[16]  Jiebo Luo,et al.  Mining Fashion Outfit Composition Using an End-to-End Deep Learning Approach on Set Data , 2016, IEEE Transactions on Multimedia.

[17]  Yu-Gang Jiang,et al.  Learning Fashion Compatibility with Bidirectional LSTMs , 2017, ACM Multimedia.

[18]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[19]  Luis E. Ortiz,et al.  Parsing clothing in fashion photographs , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Svetlana Lazebnik,et al.  Where to Buy It: Matching Street Clothing Photos in Online Shops , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21]  Dik Lun Lee,et al.  Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba , 2018, KDD.

[22]  Xiaogang Wang,et al.  DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Wei Liu,et al.  Neural Compatibility Modeling with Attentive Knowledge Distillation , 2018, SIGIR.

[24]  Jeff Johnson,et al.  Billion-Scale Similarity Search with GPUs , 2017, IEEE Transactions on Big Data.

[25]  Peng Jiang,et al.  BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer , 2019, CIKM.

[26]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[27]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[28]  Greg Linden,et al.  Amazon . com Recommendations Item-to-Item Collaborative Filtering , 2001 .

[29]  Jakob Uszkoreit,et al.  A Decomposable Attention Model for Natural Language Inference , 2016, EMNLP.

[30]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[31]  Sergey Ioffe,et al.  Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.

[32]  John Riedl,et al.  Item-based collaborative filtering recommendation algorithms , 2001, WWW '01.

[33]  Robinson Piramuthu,et al.  Large scale visual recommendations from street fashion images , 2014, KDD.