Analyzing Item Popularity Bias of Music Recommender Systems: Are Different Genders Equally Affected?

Several studies have identified discrepancies between the popularity of items in user profiles and the corresponding recommendation lists. This behavior, observed across a variety of recommendation algorithms, is referred to as popularity bias. Existing work predominantly adopts simple statistical measures, such as the difference in mean or median popularity, to quantify popularity bias. Moreover, it does so without regard to user characteristics other than the inclination toward popular content. In this work, in contrast, we propose to investigate popularity differences (between the user profile and the recommendation list) in terms of the median, a variety of statistical moments, and similarity measures that consider the entire popularity distributions (Kullback-Leibler divergence and Kendall’s τ rank-order correlation). This results in a more detailed picture of the characteristics of popularity bias. Furthermore, we investigate whether such algorithmic popularity bias affects users of different genders in the same way. We focus on music recommendation and conduct experiments on the recently released standardized LFM-2b dataset, which contains listening profiles of Last.fm users. We investigate the algorithmic popularity bias of seven common recommendation algorithms (five collaborative filtering and two baselines). Our experiments show that (1) the studied metrics provide novel insights into popularity bias in comparison with only using average differences, (2) algorithms less inclined towards popularity bias amplification do not necessarily perform worse in terms of utility (NDCG), and (3) the majority of the investigated recommenders intensify popularity bias for female users.
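
To make the measures named above concrete, the following is a minimal Python sketch of how one user's profile popularity could be compared with the popularity of that user's recommendations. It is an illustration under stated assumptions, not the authors' implementation: the function names, the equal-width binning, the smoothing constant `eps`, the direction of the Kullback-Leibler divergence (profile relative to recommendations), and the use of Kendall's tau-a on the binned histograms are all hypothetical choices made here for the example.

```python
import math
from statistics import mean, median, pvariance


def relative_delta(profile_value, rec_value):
    """Percentage change from the profile statistic to the recommendation statistic."""
    if profile_value == 0:
        return float("nan")  # undefined when the profile statistic is zero
    return 100.0 * (rec_value - profile_value) / abs(profile_value)


def standardized_moment(values, k):
    """k-th standardized moment (k=3: skewness, k=4: kurtosis)."""
    mu = mean(values)
    sigma = math.sqrt(pvariance(values, mu))
    return mean(((v - mu) / sigma) ** k for v in values)


def histogram(values, lo, hi, n_bins, eps=1e-9):
    """Normalized equal-width histogram on [lo, hi], smoothed by eps (assumption)."""
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # put v == hi in the last bin
        counts[idx] += 1
    total = sum(counts) + eps * n_bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def popularity_bias_report(profile_pop, rec_pop, n_bins=5):
    """Compare item popularity in a user profile with that of the recommendations."""
    lo = min(min(profile_pop), min(rec_pop))
    hi = max(max(profile_pop), max(rec_pop))
    p = histogram(profile_pop, lo, hi, n_bins)
    q = histogram(rec_pop, lo, hi, n_bins)
    return {
        "delta_mean_pct": relative_delta(mean(profile_pop), mean(rec_pop)),
        "delta_median_pct": relative_delta(median(profile_pop), median(rec_pop)),
        "delta_variance_pct": relative_delta(pvariance(profile_pop), pvariance(rec_pop)),
        "delta_skewness_pct": relative_delta(
            standardized_moment(profile_pop, 3), standardized_moment(rec_pop, 3)
        ),
        "kl_divergence": kl_divergence(p, q),
        "kendall_tau": kendall_tau(p, q),
    }


if __name__ == "__main__":
    # Made-up popularity scores in [0, 1]; not taken from LFM-2b.
    profile = [0.9, 0.8, 0.4, 0.2, 0.1, 0.05]
    recommendations = [0.95, 0.9, 0.85, 0.7, 0.6, 0.5]
    for name, value in popularity_bias_report(profile, recommendations).items():
        print(f"{name}: {value:.3f}")
```

In a sketch like this, a positive median or mean delta indicates that the recommendations are more popular than the items the user listened to, while the divergence and rank correlation describe how the shape of the whole distribution shifts. Averaging such per-user reports within each gender group would be one way to compare groups.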
