Factorizing LambdaMART for cold start recommendations

Recommender systems often rely on point-wise losses such as the mean squared error. However, in real recommendation settings only a few items are presented to each user, an observation that has recently encouraged the use of rank-based metrics. LambdaMART is the state-of-the-art learning-to-rank algorithm that optimizes such metrics. Motivated by the fact that the users' and items' descriptions, as well as the preference behavior, can very often be well summarized by a small number of hidden factors, we propose a novel algorithm, LambdaMART matrix factorization (LambdaMART-MF), which learns latent representations of users and items using gradient boosted trees. The algorithm factorizes LambdaMART by defining the relevance score as the inner product of the learned user and item representations. We regularize the learned latent representations so that they reflect the user and item manifolds as these are defined by the original feature-based descriptors and the preference behavior. We also propose a weighted variant of NDCG that reduces the penalty incurred when similar items have large rating discrepancies. We experiment on two very different recommendation datasets, meta-mining and movies-users, and evaluate the performance of LambdaMART-MF, with and without regularization, in the cold start setting as well as in the simpler matrix completion setting. The experiments show that the factorization of LambdaMART brings significant performance improvements in both the cold start and the matrix completion settings, while the incorporation of regularization has a smaller performance impact.
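
To make the scoring scheme concrete, the short Python sketch below (not the authors' implementation) illustrates two ingredients named in the abstract: relevance scores computed as inner products of latent user and item vectors, and an NDCG evaluation that admits per-item weights. In LambdaMART-MF the latent matrices would be produced by gradient boosted trees over the user and item feature descriptors; here they are random placeholders, and the names U, V, scores and weighted_ndcg, as well as the exact weighting scheme, are illustrative assumptions rather than the paper's API.

# Minimal sketch, assuming numpy only. U and V stand in for the latent
# factors that gradient boosted trees would produce in LambdaMART-MF.
import numpy as np

rng = np.random.default_rng(0)

n_users, n_items, d = 5, 8, 3          # toy sizes
U = rng.normal(size=(n_users, d))      # latent user factors (placeholder for tree output)
V = rng.normal(size=(n_items, d))      # latent item factors (placeholder for tree output)

def scores(U, V):
    """Predicted relevance of every item for every user: inner products U V^T."""
    return U @ V.T

def dcg(relevance_sorted):
    """Standard DCG with the (2^rel - 1) gain and log2 rank discount."""
    gains = 2.0 ** relevance_sorted - 1.0
    discounts = np.log2(np.arange(2, relevance_sorted.size + 2))
    return np.sum(gains / discounts)

def weighted_ndcg(pred, rel, weights=None):
    """NDCG of one user's ranking induced by `pred`.

    `weights` optionally down-weights the gain of individual items; the paper
    uses a weighting of this kind to soften the penalty when similar items
    carry very different ratings. The scheme used here is only a placeholder.
    """
    if weights is None:
        weights = np.ones_like(rel, dtype=float)
    order = np.argsort(-pred)                 # ranking induced by predicted scores
    ideal = np.argsort(-(weights * rel))      # ideal ranking by weighted relevance
    num = dcg((weights * rel)[order])
    den = dcg((weights * rel)[ideal])
    return num / den if den > 0 else 0.0

S = scores(U, V)                              # n_users x n_items relevance matrix
true_rel = rng.integers(0, 4, size=(n_users, n_items)).astype(float)
print("mean NDCG:", np.mean([weighted_ndcg(S[u], true_rel[u]) for u in range(n_users)]))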
