On Sampling Top-K Recommendation Evaluation

Recently, Rendle has warned that the use of sampling-based top-k metrics might not suffice. This throws a number of recent studies on deep learning-based recommendation algorithms, and on classic non-deep-learning algorithms evaluated with such metrics, into jeopardy. In this work, we thoroughly investigate the relationship between the sampling and global top-K Hit-Ratio (HR, or Recall), originally proposed by Koren [2] and extensively used by others. By formulating the problem of aligning the sampling top-k ($SHR@k$) and global top-K ($HR@K$) Hit-Ratios through a mapping function $f$, so that $SHR@k \approx HR@f(k)$, we demonstrate both theoretically and experimentally that the sampling top-k Hit-Ratio provides an accurate approximation of its global (exact) counterpart, and can consistently predict the correct winners (the same as indicated by their corresponding global Hit-Ratios).
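To make the alignment $SHR@k \approx HR@f(k)$ concrete, the following is a minimal sketch rather than the mapping derived in this work. It assumes that each user has one held-out item with a known global rank, and that negatives are sampled uniformly with replacement. The helper names (`global_hit_ratio`, `sampled_rank`, `sampled_hit_ratio`, `empirical_f`) and the synthetic rank distribution are hypothetical, chosen only for illustration. The function `empirical_f` simply searches for the K whose global Hit-Ratio is closest to an observed sampling Hit-Ratio.

```python
import random


def global_hit_ratio(global_ranks, K):
    """HR@K: fraction of users whose held-out item is in the global top K."""
    return sum(r <= K for r in global_ranks) / len(global_ranks)


def sampled_rank(global_rank, num_items, num_sampled, rng):
    """Rank of the target among itself plus (num_sampled - 1) negatives.

    Assumption: negatives are drawn uniformly with replacement from the
    other (num_items - 1) items, so each negative outranks the target
    with probability (global_rank - 1) / (num_items - 1).
    """
    p = (global_rank - 1) / (num_items - 1)
    outranked_by = sum(rng.random() < p for _ in range(num_sampled - 1))
    return 1 + outranked_by


def sampled_hit_ratio(global_ranks, num_items, num_sampled, k, rng):
    """SHR@k: fraction of users whose target is in the top k of its sample."""
    hits = sum(
        sampled_rank(r, num_items, num_sampled, rng) <= k for r in global_ranks
    )
    return hits / len(global_ranks)


def empirical_f(global_ranks, num_items, shr_value):
    """Smallest K whose HR@K is closest to the given sampling Hit-Ratio."""
    return min(
        range(1, num_items + 1),
        key=lambda K: abs(global_hit_ratio(global_ranks, K) - shr_value),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    num_items, num_sampled, num_users = 1000, 100, 2000
    # Hypothetical global ranks, skewed toward the top of the list.
    ranks = [min(num_items, 1 + int(rng.expovariate(1 / 50))) for _ in range(num_users)]
    for k in (1, 5, 10):
        shr = sampled_hit_ratio(ranks, num_items, num_sampled, k, rng)
        K = empirical_f(ranks, num_items, shr)
        print(f"SHR@{k} = {shr:.3f}  ~  HR@{K} = {global_hit_ratio(ranks, K):.3f}")
```

Running the script prints, for each k, the sampling Hit-Ratio next to the global Hit-Ratio at the empirically matched K. The brute-force search is only practical because the example is small; it needs the global ranks, so it illustrates what the mapping means rather than how to estimate it from samples alone.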

[1] Philip S. Yu, et al. Leveraging Meta-path based Context for Top-N Recommendation with A Neural Co-Attention Model, 2018, KDD.

[2] Yixin Cao, et al. Explainable Reasoning over Knowledge Graphs for Recommendation, 2018, AAAI.

[3] Yifan Hu, et al. Collaborative Filtering for Implicit Feedback Datasets, 2008, 2008 Eighth IEEE International Conference on Data Mining.

[4] T. Hughes, et al. Signals and systems, 2006, Genome Biology.

[5] Germinal Cocho, et al. Fitting Ranked Linguistic Data with Two-Parameter Functions, 2010, Entropy.

[6] Xiangnan He, et al. A Generic Coordinate Descent Framework for Learning from Implicit Feedback, 2016, WWW.

[7] Yi Tay, et al. Deep Learning based Recommender System: A Survey and New Perspectives, 2018.

[8] Deborah Estrin, et al. Unbiased offline recommender evaluation for missing-not-at-random implicit feedback, 2018, RecSys.

[9] Xiaodong He, et al. A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems, 2015, WWW.

[10] John R. Anderson, et al. Efficient Training on Very Large Corpora via Gramian Estimation, 2018, ICLR.

[11] Deborah Estrin, et al. OpenRec: A Modular Framework for Extensible and Adaptable Recommendation Algorithms, 2018, WSDM.

[12] Steffen Rendle. Evaluation Metrics for Item Recommendation under Sampling, 2019, ArXiv.

[13] Dietmar Jannach, et al. Are we really making much progress? A worrying analysis of recent neural recommendation approaches, 2019, RecSys.

[14] Matthew D. Hoffman, et al. Variational Autoencoders for Collaborative Filtering, 2018, WWW.

[15] Harald Steck, et al. Embarrassingly Shallow Autoencoders for Sparse Data, 2019, WWW.

[16] Roberto Turrin, et al. Performance of recommender algorithms on top-n recommendation tasks, 2010, RecSys '10.

[17] Yehuda Koren, et al. Factorization meets the neighborhood: a multifaceted collaborative filtering model, 2008, KDD.

[18] Bin Shen, et al. Collaborative Memory Network for Recommendation Systems, 2018, SIGIR.

[19] George Karypis, et al. Item-based top-N recommendation algorithms, 2004, TOIS.

[20] Yehuda Koren, et al. On the Difficulty of Evaluating Baselines: A Study on Recommender Systems, 2019, ArXiv.

[21] Tat-Seng Chua, et al. Neural Collaborative Filtering, 2017, WWW.