OFF-set: one-pass factorization of feature sets for online recommendation in persistent cold start settings

One of the most challenging recommendation tasks is recommending to a new, previously unseen user. This is known as the user cold start problem. Assuming certain features or attributes of users are known, one approach for handling new users is to initially model them based on their features. Motivated by an ad targeting application, this paper describes an extreme online recommendation setting where the cold start problem is perpetual: every user is encountered by the system just once, receives a recommendation, and either consumes or ignores it, registering a binary reward. We introduce One-pass Factorization of Feature Sets, 'OFF-Set', a novel recommendation algorithm based on latent factor analysis, which models users by mapping their features to a latent space. OFF-Set is able to model non-linear interactions between pairs of features, and updates its model after each recommendation-reward observation in a purely online fashion. We evaluate OFF-Set against several state-of-the-art baselines, and show that it outperforms them on real ad-targeting data.
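To make the setting concrete, the following is a minimal sketch of one-pass learning over feature sets. It is not the OFF-Set algorithm itself, whose exact formulation the abstract does not give. It assumes a generic factorization-machine-style scorer in which every active feature (user attributes plus the recommended ad) has a bias and a latent vector, pairwise interactions are inner products of those vectors, and each recommendation-reward observation triggers a single stochastic gradient step on the log loss. The class name, hyperparameters, and feature strings are all hypothetical.

```python
import math
import random
from collections import defaultdict


class OnlinePairwiseFactorModel:
    """Hypothetical FM-style scorer over a set of active binary features."""

    def __init__(self, dim=8, lr=0.05, reg=0.01, seed=0):
        self.dim, self.lr, self.reg = dim, lr, reg
        self.rng = random.Random(seed)
        self.bias = 0.0
        self.w = defaultdict(float)   # per-feature bias
        self.v = {}                   # per-feature latent vector

    def _vec(self, feature):
        # Lazily create a small random vector for a feature seen for the first time.
        if feature not in self.v:
            self.v[feature] = [self.rng.gauss(0.0, 0.1) for _ in range(self.dim)]
        return self.v[feature]

    def predict(self, features):
        feats = list(dict.fromkeys(features))  # drop duplicates, keep order
        vecs = [self._vec(f) for f in feats]
        score = self.bias + sum(self.w[f] for f in feats)
        # Sum of inner products over all feature pairs, computed in O(n * dim).
        for k in range(self.dim):
            s = sum(vec[k] for vec in vecs)
            sq = sum(vec[k] ** 2 for vec in vecs)
            score += 0.5 * (s * s - sq)
        score = max(-30.0, min(30.0, score))   # guard against overflow in exp
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, reward):
        """One SGD step on the log loss for a single observation; returns the prediction."""
        feats = list(dict.fromkeys(features))
        p = self.predict(feats)
        g = p - reward                          # d(log loss) / d(score)
        vecs = [self._vec(f) for f in feats]
        sums = [sum(vec[k] for vec in vecs) for k in range(self.dim)]
        self.bias -= self.lr * g
        for f, vec in zip(feats, vecs):
            self.w[f] -= self.lr * (g + self.reg * self.w[f])
            for k in range(self.dim):
                grad = g * (sums[k] - vec[k]) + self.reg * vec[k]
                vec[k] -= self.lr * grad
        return p


# Hypothetical stream: every user is seen exactly once; here young users click.
stream_rng = random.Random(1)
model = OnlinePairwiseFactorModel()
for _ in range(500):
    age = stream_rng.choice(["age:young", "age:old"])
    ad = stream_rng.choice(["ad:games", "ad:cars"])
    reward = 1 if age == "age:young" else 0
    model.update([age, ad], reward)
```

Because the model stores parameters per feature rather than per user, a user who has never been seen before can still be scored from their attributes alone, which is the property the persistent cold start setting requires.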
