Unified relevance models for rating prediction in collaborative filtering

Collaborative filtering aims to predict a user's interest in a given item based on a collection of user profiles. This article views collaborative filtering as a problem closely related to information retrieval, drawing an analogy between users and items in recommender systems and queries and documents in text retrieval. We present a probabilistic user-to-item relevance framework that introduces the concept of relevance into collaborative filtering. Three models are derived, namely a user-based, an item-based, and a unified relevance model, and we estimate their rating predictions from three sources: the user's own ratings for different items, other users' ratings for the same item, and ratings from different but similar users for other but similar items. To reduce the effect of data sparsity when estimating the probability density function of the relevance variable, we apply the nonparametric (data-driven) density estimation technique known as the Parzen-window method (or kernel-based density estimation). With a Gaussian window function, however, the similarity between users and/or items would be based on Euclidean distance. Because the collaborative filtering literature has reported improved prediction accuracy when using cosine similarity, we generalize the Parzen-window method by introducing a projection kernel. Existing user-based and item-based approaches correspond to two simplified instantiations of our framework. User-based and item-based collaborative filtering each represent only a partial view of the prediction problem, whereas the unified relevance model brings these partial views together under the same umbrella. Experimental results complement the theoretical insights with improved recommendation accuracy. The unified model is more robust to data sparsity because the different types of ratings are used in concert.
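
To make the contrast between a Gaussian window on Euclidean distance and a cosine-based weighting concrete, the following is a minimal sketch of a kernel-weighted, user-based rating prediction. It is not the article's model: the function names, the toy ratings, the bandwidth `h`, and the use of a clipped cosine similarity as a stand-in for the projection kernel are all assumptions made for illustration.

```python
import math

def gaussian_weight(a, b, h=1.0):
    """Gaussian window on the Euclidean distance between two rating vectors.
    The bandwidth h is a hypothetical smoothing parameter."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * h * h))

def cosine_weight(a, b):
    """Cosine similarity, clipped at zero so it can act as a weight.
    Used here only as an illustrative stand-in for a projection kernel."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return max(dot / norm, 0.0) if norm > 0 else 0.0

def predict_user_based(ratings, user, item, weight):
    """Kernel-weighted average of other users' ratings for `item`.

    `ratings` maps user -> {item: rating}. Two users are compared on the
    items both have rated, excluding the target item. Returns None when
    no neighbour has rated the item.
    """
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        shared = [i for i in ratings[user] if i in other_ratings and i != item]
        if not shared:
            continue
        a = [ratings[user][i] for i in shared]
        b = [other_ratings[i] for i in shared]
        w = weight(a, b)
        num += w * other_ratings[item]
        den += w
    return num / den if den > 0 else None

# Toy data (hypothetical): bob agrees with alice, carol disagrees.
ratings = {
    "alice": {"m1": 5, "m2": 1},
    "bob":   {"m1": 5, "m2": 1, "m3": 4},
    "carol": {"m1": 1, "m2": 5, "m3": 2},
}

gauss_pred = predict_user_based(ratings, "alice", "m3", gaussian_weight)
cos_pred = predict_user_based(ratings, "alice", "m3", cosine_weight)
```

On this toy data the Gaussian window gives the dissimilar user almost no weight, so the prediction sits very close to the similar user's rating, while the cosine weighting still lets the dissimilar user pull the estimate down. Which behaviour is preferable depends on the data; the sketch only shows that the choice of kernel changes how neighbours are weighted.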
