Challenging the Long Tail Recommendation

The success of "infinite-inventory" retailers such as Amazon.com and Netflix has been largely attributed to a "long tail" phenomenon. Although the majority of their inventory is not in high demand, these niche products, unavailable at limited-inventory competitors, generate a significant fraction of total revenue in aggregate. In addition, tail product availability can boost head sales by offering consumers the convenience of "one-stop shopping" for both their mainstream and niche tastes. However, most existing recommender systems, especially collaborative-filtering-based methods, cannot recommend tail products because of data sparsity. It is widely acknowledged that recommending popular products is easier but more trivial, whereas recommending long tail products adds novelty but is a more challenging task. In this paper, we propose a novel suite of graph-based algorithms for long tail recommendation. We first represent user-item information as an undirected edge-weighted graph and investigate the theoretical foundation of applying the Hitting Time algorithm to long tail item recommendation. To improve recommendation diversity and accuracy, we extend Hitting Time and propose an efficient Absorbing Time algorithm to help users find their favorite long tail items. Finally, we refine the Absorbing Time algorithm and propose two entropy-biased Absorbing Cost algorithms that distinguish the variation across different user-item rating pairs, which further enhances the effectiveness of long tail recommendation. Empirical experiments on two real-life datasets show that our proposed algorithms are effective at recommending long tail items and outperform state-of-the-art recommendation techniques.
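To make the graph-based idea concrete, the following is a minimal sketch of hitting time on an undirected edge-weighted user-item graph. It is not the paper's implementation: the helper names (`build_graph`, `hitting_times`, `recommend`), the `u:`/`i:` node prefixes, the toy ratings, and the choice to rank unrated items by ascending hitting time to the query user are all assumptions made for illustration. The Absorbing Time and entropy-biased Absorbing Cost extensions are not shown.

```python
def build_graph(ratings):
    """Build an undirected graph: node -> {neighbor: edge weight}.

    Each (user, item, rating) triple becomes one edge weighted by the rating.
    User and item ids are assumed to be distinct strings (hypothetical prefixes).
    """
    graph = {}
    for user, item, rating in ratings:
        graph.setdefault(user, {})[item] = rating
        graph.setdefault(item, {})[user] = rating
    return graph


def hitting_times(graph, target, tol=1e-10, max_iter=100_000):
    """Expected number of random-walk steps from every node to `target`.

    Solves h(target) = 0 and h(i) = 1 + sum_j p_ij * h(j) by Gauss-Seidel
    iteration, where p_ij is the edge weight normalized by the weighted
    degree of i. The graph is assumed to be connected.
    """
    h = {node: 0.0 for node in graph}
    for _ in range(max_iter):
        delta = 0.0
        for node, neighbors in graph.items():
            if node == target:
                continue
            total = sum(neighbors.values())
            new = 1.0 + sum(w / total * h[nbr] for nbr, w in neighbors.items())
            delta = max(delta, abs(new - h[node]))
            h[node] = new
        if delta < tol:
            break
    return h


def recommend(ratings, user, k=3):
    """Rank items the user has not rated by ascending hitting time to the user.

    Assumption for this sketch: a smaller hitting time means a better candidate.
    """
    graph = build_graph(ratings)
    h = hitting_times(graph, user)
    items = {item for _, item, _ in ratings}
    candidates = items - set(graph[user])
    return sorted(candidates, key=lambda item: h[item])[:k]


# Toy data (made up): "i:hit" is rated by three users, "i:block" by three
# others, and "i:niche" by a single user who shares a taste with alice.
ratings = [
    ("u:alice", "i:hit", 5),
    ("u:bob", "i:hit", 5),
    ("u:bob", "i:niche", 5),
    ("u:carol", "i:hit", 5),
    ("u:carol", "i:block", 5),
    ("u:dave", "i:block", 5),
    ("u:erin", "i:block", 5),
]

print(recommend(ratings, "u:alice"))  # ['i:niche', 'i:block']
```

In this toy graph the rarely rated item `i:niche` has a smaller hitting time to `u:alice` (17 steps) than the more heavily rated `i:block` (25 steps), because a walk leaving a low-degree item has fewer ways to wander away from the query user. That is the intuition behind using a random-walk measure to surface tail items.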
