Taxonomy discovery for personalized recommendation

Personalized recommender systems based on latent factor models are widely used to increase sales in e-commerce. Such systems use the past behavior of users to recommend new items that are likely to be of interest to them. However, latent factor model suffer from sparse user-item interaction in online shopping data: for a large portion of items that do not have sufficient purchase records, their latent factors cannot be estimated accurately. In this paper, we propose a novel approach that automatically discovers the taxonomies from online shopping data and jointly learns a taxonomy-based recommendation system. Out model is non-parametric and can learn the taxonomy structure automatically from the data. Since the taxonomy allows purchase data to be shared between items, it effectively improves the accuracy of recommending tail items by sharing strength with the more frequent items. Experiments on a large-scale online shopping dataset confirm that our proposed model improves significantly over state-of-the-art latent factor models. Moreover, our model generates high-quality and human readable taxonomies. Finally, using the algorithm-generated taxonomy, our model even outperforms latent factor models based on the human-induced taxonomy, thus alleviating the need for costly manual taxonomy generation.

[1]  Sachin Garg,et al.  Response prediction using collaborative filtering with hierarchies and side-information , 2011, KDD.

[2]  James Bennett,et al.  The Netflix Prize , 2007 .

[3]  Lars Schmidt-Thieme,et al.  Pairwise interaction tensor factorization for personalized tag recommendation , 2010, WSDM '10.

[4]  Andriy Mnih,et al.  Taxonomy-Informed Latent Factor Models for Implicit Feedback , 2012, KDD Cup.

[5]  Richi Nayak,et al.  Exploiting Item Taxonomy for Solving Cold-Start Problem in Recommendation Making , 2008, 2008 20th IEEE International Conference on Tools with Artificial Intelligence.

[6]  Yee Whye Teh,et al.  Learning Label Trees for Probabilistic Modelling of Implicit Feedback , 2012, NIPS.

[7]  Deepak Agarwal,et al.  Regression-based latent factor models , 2009, KDD.

[8]  Michael I. Jordan,et al.  Tree-Structured Stick Breaking for Hierarchical Data , 2010, NIPS.

[9]  Yehuda Koren,et al.  Factorization meets the neighborhood: a multifaceted collaborative filtering model , 2008, KDD.

[10]  Lars Schmidt-Thieme,et al.  Taxonomy-driven computation of product recommendations , 2004, CIKM '04.

[11]  Yehuda Koren,et al.  Collaborative filtering with temporal dynamics , 2009, KDD.

[12]  Lars Schmidt-Thieme,et al.  BPR: Bayesian Personalized Ranking from Implicit Feedback , 2009, UAI.

[13]  Thomas L. Griffiths,et al.  The nested chinese restaurant process and bayesian nonparametric inference of topic hierarchies , 2007, JACM.

[14]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[15]  Yehuda Koren,et al.  Matrix Factorization Techniques for Recommender Systems , 2009, Computer.

[16]  Vanja Josifovski,et al.  Supercharging Recommender Systems using Taxonomies for Learning User Purchase Behavior , 2012, Proc. VLDB Endow..

[17]  Alexander J. Smola,et al.  The Nested Chinese Restaurant Franchise Process: User Tracking and Document Modeling , 2013 .