Calibrated recommendations

When a user has watched, say, 70 romance movies and 30 action movies, it is reasonable to expect the personalized list of recommended movies to be composed of about 70% romance and 30% action movies as well. This important property is known as calibration, and it has recently received renewed attention in the context of fairness in machine learning. In the recommended list of items, calibration ensures that the various (past) areas of interest of a user are reflected in their corresponding proportions. Calibration is especially important because recommender systems optimized toward accuracy (e.g., ranking metrics) in the usual offline setting can easily produce recommendations in which the lesser interests of a user get crowded out by the user's main interests, an effect we show both empirically and in thought experiments. Calibrated recommendations prevent this. To this end, we outline metrics for quantifying the degree of calibration, as well as a simple yet effective re-ranking algorithm for post-processing the output of recommender systems.
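As a rough illustration of what a calibration metric and a post-processing re-ranker could look like, the sketch below compares the genre distribution of a user's history with that of a recommended list and then greedily builds a list that trades relevance against miscalibration. The use of a smoothed KL divergence, the smoothing constant `alpha`, the trade-off weight `lam`, all function names, and the toy data are assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def genre_distribution(genres):
    """Turn a list of genre labels into a probability distribution."""
    counts = Counter(genres)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def kl_miscalibration(p, q, alpha=0.01):
    """KL(p || q_s), with q_s = (1 - alpha) * q + alpha * p.

    The smoothing (an assumption here) keeps the divergence finite when
    a genre from the history is missing from the recommended list.
    """
    score = 0.0
    for g, p_g in p.items():
        q_g = (1 - alpha) * q.get(g, 0.0) + alpha * p_g
        score += p_g * math.log(p_g / q_g)
    return score


def calibrated_rerank(candidates, p, k, lam=0.5):
    """Greedily pick k items from (item, genre, score) candidates.

    Each step adds the candidate that maximizes
    (1 - lam) * total relevance - lam * miscalibration of the list so far.
    lam = 0 reduces to plain ranking by score.
    """
    chosen, remaining = [], list(candidates)
    while remaining and len(chosen) < k:
        def objective(c):
            trial = chosen + [c]
            q = genre_distribution([g for _, g, _ in trial])
            relevance = sum(s for _, _, s in trial)
            return (1 - lam) * relevance - lam * kl_miscalibration(p, q)

        best = max(remaining, key=objective)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy data (hypothetical): 70 romance and 30 action movies watched.
p = genre_distribution(["romance"] * 70 + ["action"] * 30)

# A recommender that scores every romance title above every action title.
candidates = [(f"r{i}", "romance", 0.90 - 0.01 * i) for i in range(10)]
candidates += [(f"a{i}", "action", 0.80 - 0.01 * i) for i in range(10)]

baseline = calibrated_rerank(candidates, p, k=10, lam=0.0)
calibrated = calibrated_rerank(candidates, p, k=10, lam=0.99)

for name, recs in [("baseline", baseline), ("calibrated", calibrated)]:
    q = genre_distribution([g for _, g, _ in recs])
    print(name, dict(Counter(g for _, g, _ in recs)), kl_miscalibration(p, q))
```

In this toy setup, ranking purely by score fills the list with romance titles and crowds out the user's interest in action, whereas the re-ranked list gives up a little relevance to bring action titles back in. A larger `lam` weights calibration more heavily; how to set it in practice is outside the scope of this sketch.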
