Discovering Style Trends Through Deep Visually Aware Latent Item Embeddings

In this paper, we explore Latent Dirichlet Allocation (LDA) [1] and Polylingual Latent Dirichlet Allocation (PolyLDA) [6] as a means to discover trending styles at Overstock from deep visual semantic features, transferred from a pretrained convolutional neural network, together with text-based item attributes. To use deep visual semantic features in conjunction with LDA, we develop a method for creating a bag-of-words representation from unrolled image vectors. By viewing each channel within the convolutional layers of a ResNet-50 [2] as representing a word, we can index these activations to create visual documents. We then train LDA over these documents to discover the latent styles in the images. We also incorporate text-based data with PolyLDA, where each representation is viewed as an independent language attempting to describe the same underlying style. The resulting topics are shown to be excellent indicators of visual style across our platform.
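To make the construction concrete, the following is a minimal sketch, assuming PyTorch and gensim, of how channel activations from a pretrained ResNet-50 could be quantized into bag-of-words visual documents and fed to LDA. The choice of layer4, the mean pooling, the quantization scale, and the file paths are illustrative assumptions, not details taken from the paper.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from gensim.models import LdaModel

# Pretrained ResNet-50; activations are read from the final convolutional
# stage (layer4), whose 2048 channels serve as the visual vocabulary.
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
activations = {}
resnet.layer4.register_forward_hook(
    lambda module, inp, out: activations.update(feat=out)
)

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def visual_document(image_path):
    """Map an image to a bag of visual words: one word id per channel,
    with counts derived from that channel's pooled activation."""
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        resnet(img)
    feat = activations["feat"].squeeze(0)   # shape (2048, 7, 7)
    pooled = feat.mean(dim=(1, 2))          # one activation per channel
    # Quantize pooled activations into integer word counts (assumed scheme).
    counts = (pooled / pooled.max() * 10).round().int()
    return [(ch, int(c)) for ch, c in enumerate(counts) if c > 0]

# Hypothetical image paths; in practice these would be product photos.
corpus = [visual_document(p) for p in ["item1.jpg", "item2.jpg"]]
vocab = {i: f"channel_{i}" for i in range(2048)}  # channel index -> "word"
lda = LdaModel(corpus=corpus, num_topics=20, id2word=vocab)
print(lda.get_document_topics(corpus[0]))  # style-topic mixture of one item
```

Under this reading, each topic learned by LDA is a distribution over convolutional channels, and an item's topic mixture can be interpreted as its blend of visual styles.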