Boosting collaborative filtering based on statistical prediction errors

User-based collaborative filtering methods typically predict a user's rating of an item as a weighted average of the ratings given by similar users, where each weight is proportional to the user similarity. The accuracy of the user similarities is therefore key to the success of the recommendation, both for selecting neighborhoods and for computing predictions. However, the computed similarities between users are somewhat inaccurate because of data sparsity. For a given user, the sets of neighbors selected for predicting ratings on different items typically overlap. The error terms contributing to the rating predictions therefore tend to be shared, which leads to correlated prediction errors. Through a set of case studies, we discovered that, for a given user, the prediction errors on different items are correlated with the similarities of the corresponding items and with the degree to which the items share common neighbors. We propose a framework that improves prediction accuracy based on these statistical prediction errors, together with two different strategies for estimating the prediction error on a desired item. Our experiments show that these approaches significantly improve the prediction accuracy of standard user-based methods and that they outperform other state-of-the-art methods.
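
The baseline described above, a similarity-weighted average over a user's nearest neighbors, can be illustrated with a minimal sketch. It is not the method proposed in the paper. It assumes cosine similarity over co-rated items, a rating of 0 as the marker for "unrated", and hypothetical helper names (`cosine_similarity`, `select_neighbors`, `predict`, `neighbor_overlap`). The last helper measures how much the neighbor sets used for two items overlap, which is the quantity the abstract relates to correlated prediction errors.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity over items both users rated (0 means unrated; assumption)."""
    both = (a > 0) & (b > 0)
    if not both.any():
        return 0.0
    num = float(np.dot(a[both], b[both]))
    den = float(np.linalg.norm(a[both]) * np.linalg.norm(b[both]))
    return num / den if den > 0 else 0.0


def select_neighbors(ratings, user, item, k=2):
    """Return up to k (similarity, user_index) pairs among users who rated `item`."""
    candidates = []
    for other in range(ratings.shape[0]):
        if other != user and ratings[other, item] > 0:
            sim = cosine_similarity(ratings[user], ratings[other])
            if sim > 0:
                candidates.append((sim, other))
    candidates.sort(reverse=True)  # most similar users first
    return candidates[:k]


def predict(ratings, user, item, k=2):
    """Similarity-weighted average of the neighbors' ratings, or None if no neighbors."""
    neighbors = select_neighbors(ratings, user, item, k)
    if not neighbors:
        return None
    total_weight = sum(sim for sim, _ in neighbors)
    weighted_sum = sum(sim * ratings[other, item] for sim, other in neighbors)
    return weighted_sum / total_weight


def neighbor_overlap(ratings, user, item_a, item_b, k=2):
    """Jaccard overlap of the neighbor sets used to predict two different items."""
    set_a = {other for _, other in select_neighbors(ratings, user, item_a, k)}
    set_b = {other for _, other in select_neighbors(ratings, user, item_b, k)}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy user-by-item rating matrix (illustrative data, not from the paper).
toy_ratings = np.array([
    [5, 4, 0, 0],
    [5, 4, 4, 5],
    [4, 5, 5, 4],
    [1, 2, 1, 0],
])
print(predict(toy_ratings, user=0, item=2))
print(neighbor_overlap(toy_ratings, user=0, item_a=2, item_b=3))
```

When the same neighbors drive the predictions for two items, any error in their estimated similarities enters both predictions, which is why a high overlap would be expected to go with correlated errors.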
