AdaM: Adaptive-Maximum imputation for neighborhood-based collaborative filtering

In collaborative filtering (CF), data sparsity leaves even like-minded users with few co-rated items, so their computed similarity is low and the k-nearest-neighbour rule becomes unreliable. In this paper, we address the data sparsity problem in neighbourhood-based CF methods by proposing an Adaptive-Maximum imputation method (AdaM). The basic idea is to identify an imputation area that maximizes the imputation benefit for recommendation purposes while minimizing the imputation error introduced. To maximize the imputation benefit, the imputation area is determined from both the user and the item perspectives; to minimize the imputation error, at least one real rating is preserved for each item in the identified imputation area. A theoretical analysis shows that the proposed imputation method outperforms conventional neighbourhood-based CF methods through more accurate neighbour identification. Experimental results on benchmark datasets show that the proposed method significantly outperforms related state-of-the-art imputation-based methods in terms of accuracy.
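The idea of a bounded imputation area can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define the area, so the selection rule here (rows are users who really rated the target item, columns are items the active user really rated) is an assumption, as are the toy data and the names `ratings`, `co_rated`, `imputation_area`, `item_mean` and `impute`. Plain item-mean filling stands in for whatever imputation AdaM actually uses.

```python
# Toy ratings (hypothetical data): user -> {item: rating}
ratings = {
    "alice": {"i1": 5, "i2": 4, "i3": 1},
    "bob":   {"i1": 5, "i4": 4},
    "carol": {"i2": 4, "i3": 2, "i4": 5},
    "dave":  {"i4": 2, "i5": 3},
}

def co_rated(u, v):
    """Items rated by both users; sparsity makes this set small."""
    return sorted(set(ratings[u]) & set(ratings[v]))

def imputation_area(active_user, target_item):
    """Assumed area: users who rated the target item (rows) crossed with
    items the active user rated (columns). Every column then keeps at
    least one real rating, namely the active user's own."""
    rows = sorted(u for u, r in ratings.items()
                  if target_item in r and u != active_user)
    cols = sorted(ratings[active_user])
    return rows, cols

def item_mean(item):
    """Mean of the real ratings of an item (stand-in imputation rule)."""
    vals = [r[item] for r in ratings.values() if item in r]
    return sum(vals) / len(vals)

def impute(rows, cols):
    """Fill only the missing cells inside the area; real ratings are kept."""
    return {u: {i: ratings[u].get(i, item_mean(i)) for i in cols}
            for u in rows}

rows, cols = imputation_area("alice", "i4")
filled = impute(rows, cols)
print(co_rated("alice", "bob"))  # overlap before imputation
print(filled["bob"])             # bob's row inside the area after imputation
```

Before imputation, alice and bob share a single co-rated item, so any similarity between them rests on one data point. Inside the assumed area, every candidate neighbour has a value for each item alice rated, so similarities can be computed over the same item set, while cells outside the area are left untouched.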
