Scalable and Explainable Outfit Generation

We present an end-to-end system for learning outfit recommendations. The core problem we address is how to recommend a clothing item or accessory to a customer, given the customer's current outfit and the category of item they wish to add to it. Using a repository of coherent and stylish outfits, we leverage self-attention to learn a mapping from the current outfit and the customer-requested category to a visual descriptor. This descriptor is then fed into a nearest-neighbor-based visual search, which is learned during training via a triplet loss and mini-batch retrievals. At inference time, we use beam search with a desired outfit composition to generate outfits at scale. Moreover, the attention networks provide a diagnostic look into the recommendation process, serving as a fashion-based sanity check.
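The inference procedure described above (predict a descriptor, retrieve nearest neighbors in the requested category, and keep the best partial outfits in a beam) can be illustrated with a minimal sketch. It is not the system's implementation: `predict_descriptor` is a hypothetical stand-in that averages the current item embeddings where the described system would use the learned self-attention model, and the catalog layout, Euclidean distance, and cumulative-distance beam score are all assumptions made for illustration.

```python
import math


def predict_descriptor(outfit_vectors):
    # ASSUMPTION: placeholder for the learned self-attention model.
    # Here the "descriptor" is simply the mean of the current item embeddings.
    n = len(outfit_vectors)
    return [sum(column) / n for column in zip(*outfit_vectors)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_items(query, catalog, category, k):
    # catalog (hypothetical layout): {category: [(item_id, embedding), ...]}
    scored = [(euclidean(query, vec), item_id, vec)
              for item_id, vec in catalog[category]]
    scored.sort(key=lambda entry: entry[0])
    return scored[:k]


def beam_search_outfits(seed_id, seed_vec, composition, catalog,
                        beam_width=3, k=3):
    # Each beam entry is (item_ids, item_vectors, cumulative_distance).
    beams = [([seed_id], [seed_vec], 0.0)]
    for category in composition:  # desired outfit composition, in order
        candidates = []
        for ids, vecs, cost in beams:
            query = predict_descriptor(vecs)
            for dist, item_id, vec in nearest_items(query, catalog, category, k):
                candidates.append((ids + [item_id], vecs + [vec], cost + dist))
        # Keep the partial outfits with the lowest cumulative distance.
        candidates.sort(key=lambda entry: entry[2])
        beams = candidates[:beam_width]
    return [(ids, cost) for ids, _, cost in beams]
```

Calling `beam_search_outfits` with a seed item, a composition such as `["shoes", "bag"]`, and a small catalog returns up to `beam_width` complete outfits ordered by cumulative retrieval distance. A production system would replace the linear scan in `nearest_items` with an approximate nearest-neighbor index.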
