Show Me The Best Outfit for A Certain Scene: A Scene-aware Fashion Recommender System

Fashion recommendation (FR) has received increasing attention in research on new types of recommender systems. Existing fashion recommender systems (FRSs) typically focus on suggesting clothing items to users in three scenarios: 1) how to best recommend fashion items preferred by users; 2) how to best compose a complete outfit; and 3) how to best complete a clothing ensemble. However, current FRSs often overlook an important aspect of FR: the compatibility of a recommended clothing item or outfit is highly dependent on the scene context. To this end, we propose the scene-aware fashion recommender system (SAFRS), which explores a hitherto unexplored avenue in which scene information is taken into account when constructing the FR model. More specifically, SAFRS encodes scene and outfit information in separate attention encoders and then fuses the resulting feature embeddings via a novel scene-aware compatibility score function. Extensive qualitative and quantitative experiments show that our SAFRS model outperforms all baselines on every evaluated metric.
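The two-encoder design described above can be illustrated with a minimal sketch. The abstract does not specify the encoder architecture or the form of the compatibility score, so everything here is an assumption made for illustration: `self_attention_pool` is a hypothetical, weight-free single-head self-attention followed by mean pooling, and `compatibility_score` is a hypothetical stand-in that fuses the two embeddings with a sigmoid of their cosine similarity. It is not the SAFRS score function.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(tokens):
    """Hypothetical encoder: scaled dot-product self-attention with no
    learned projections, followed by mean pooling into one embedding."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attended = []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in tokens])
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, tokens)) for i in range(dim)]
        )
    return [sum(row[i] for row in attended) / len(attended) for i in range(dim)]


def compatibility_score(scene_tokens, outfit_tokens):
    """Hypothetical fusion: encode scene and outfit separately, then map the
    cosine similarity of the two embeddings through a sigmoid."""
    scene_emb = self_attention_pool(scene_tokens)
    outfit_emb = self_attention_pool(outfit_tokens)
    norm = math.sqrt(dot(scene_emb, scene_emb)) * math.sqrt(dot(outfit_emb, outfit_emb))
    cosine = dot(scene_emb, outfit_emb) / norm if norm > 0 else 0.0
    return 1.0 / (1.0 + math.exp(-cosine))


# Toy inputs (assumed): 4 scene patch features and 3 outfit item features, 8-dim each.
rng = random.Random(0)
scene = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(4)]
outfit = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
score = compatibility_score(scene, outfit)
```

In a trained system the encoders would carry learned parameters and the score would be optimized against compatibility labels; the sketch only shows the data flow of separate encoding followed by fusion into a single scalar.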
