Fast Recommendations With the M-Distance

Memory-based recommender systems with m users and n items typically require O(mn) space to store the rating information. In item-based collaborative filtering (CF) algorithms, the feature vector of each item has length m, and it takes O(m) time to compute the similarity between two items using the Pearson or cosine distance. In this paper, we propose an efficient CF algorithm based on a new measure, called the M-distance, which is defined as the difference between the average ratings of two items. In the initialization stage, we compute the average ratings of the items and store them in two vectors, which requires O(m) space; scanning the rating dataset then takes O(mn) time. In the online prediction stage, a threshold δ is employed to identify similar items. To predict p ratings, our algorithm requires O(np) time, compared with the O(mnp) time of the cosine-based kNN algorithm. Experiments are undertaken on four well-known datasets, and the proposed M-distance is compared with the cosine-based kNN, Pearson-based kNN, and Slope One methods. Our results show that the new algorithm is significantly faster than the conventional techniques, especially on large datasets, and that its prediction accuracy, measured by the mean absolute error and root mean square error, is no worse.
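The two-stage scheme in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes ratings are held in a dense users × items matrix with 0 marking missing entries, and that a rating is predicted as the mean of the active user's ratings on items whose average rating lies within δ of the target item's average (the paper's threshold rule), falling back to the target item's own average when no neighbor exists. The function name and δ default are invented for the example.

```python
import numpy as np

def predict_m_distance(ratings, user, item, delta=0.5):
    """Predict ratings[user, item] using the M-distance heuristic.

    ratings: 2-D float array (users x items); 0 marks a missing rating.
    The M-distance between two items is the absolute difference of
    their average ratings; items within delta count as neighbors.
    """
    # Initialization stage: per-item averages over observed ratings only
    # (one O(mn) scan; the averages themselves fit in small vectors).
    observed = ratings > 0
    counts = observed.sum(axis=0)
    item_avg = np.divide(ratings.sum(axis=0), counts,
                         out=np.zeros(ratings.shape[1]), where=counts > 0)

    # Online stage: neighbors are items this user has rated whose
    # average rating is within delta of the target item's average.
    rated = observed[user].copy()
    rated[item] = False  # exclude the target item itself
    neighbors = rated & (np.abs(item_avg - item_avg[item]) < delta)
    if not neighbors.any():
        return item_avg[item]  # fall back to the item's average
    return ratings[user, neighbors].mean()

# Toy example: 3 users x 4 items.
R = np.array([[5, 3, 0, 1],
              [4, 0, 4, 1],
              [0, 2, 5, 0]], dtype=float)
print(predict_m_distance(R, user=0, item=2, delta=1.0))  # → 5.0
```

The online stage touches only the length-n average vector rather than the length-m item columns, which is where the O(np) versus O(mnp) gap claimed above comes from.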
