Style in the long tail: discovering unique interests with latent variable models in large scale social E-commerce

Purchasing decisions in many product categories are heavily influenced by the shopper's aesthetic preferences. It is not enough to match a shopper with popular items from the category in question; a successful shopping experience also identifies products that match those aesthetics. Capturing shoppers' styles becomes more difficult as the size and diversity of the marketplace increase. At Etsy, an online marketplace for handmade and vintage goods with over 30 million diverse listings, the problem of capturing taste is particularly important: users come to the site specifically to find items that match their eclectic styles. In this paper, we describe our methods and experiments for deploying two new style-based recommender systems on the Etsy site. We use Latent Dirichlet Allocation (LDA) to discover trending categories and styles on Etsy, which are then used to describe a user's "interest" profile. We also explore hashing methods to perform fast nearest-neighbor search on a MapReduce framework, in order to obtain recommendations efficiently. These techniques have been implemented successfully at very large scale, substantially improving many key business metrics.
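
As a rough illustration of the hashing idea, the sketch below buckets users by a locality-sensitive hash of their interest profiles and then ranks only the users that share a bucket. It is a minimal single-process sketch under stated assumptions, not the system described in the paper: the abstract does not name the hash family, so random-hyperplane (cosine) hashing is assumed here, each profile is assumed to be a non-zero vector of LDA topic weights, and every function name and the toy data are hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes (assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def cosine(a, b):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_index(profiles, hyperplanes):
    """Group users by signature; users with similar profiles tend to collide."""
    buckets = defaultdict(list)
    for user, vec in profiles.items():
        buckets[signature(vec, hyperplanes)].append(user)
    return buckets


def similar_users(user, profiles, buckets, hyperplanes, top_k=3):
    """Rank only the users in the query user's bucket by cosine similarity."""
    query = profiles[user]
    candidates = [
        u for u in buckets.get(signature(query, hyperplanes), []) if u != user
    ]
    candidates.sort(key=lambda u: cosine(query, profiles[u]), reverse=True)
    return candidates[:top_k]


# Hypothetical topic-weight profiles over four styles (each sums to 1).
profiles = {
    "alice": [0.70, 0.10, 0.10, 0.10],
    "bob": [0.70, 0.10, 0.10, 0.10],
    "carol": [0.05, 0.05, 0.10, 0.80],
}
planes = make_hyperplanes(num_bits=8, dim=4, seed=42)
index = build_index(profiles, planes)
print(similar_users("alice", profiles, index, planes))
```

Only users whose signatures collide are compared exactly, which is what makes the search cheaper than an all-pairs comparison; in a MapReduce setting the signature would naturally serve as the shuffle key, though that mapping is an assumption about the deployment rather than something the abstract states.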
