Re-considering neighborhood-based collaborative filtering parameters in the context of new data

The MovieLens dataset and the 1999 study by Herlocker et al. have been highly influential in collaborative filtering. Yet the age of both invites a re-examination of their applicability. We use Netflix challenge data to revisit the earlier results. In particular, we re-evaluate the parameters of Herlocker et al.'s method with respect to two critical factors: how similarity between users is measured and how users' ratings are normalized. We find that normalization plays a significant role and that Pearson correlation is not necessarily the best similarity metric.