Predicting the Performance of Collaborative Filtering Algorithms

Collaborative filtering (CF) algorithms are widely used in recommendation engines, but their performance differs substantially from one dataset to another. Can we predict whether collaborative filtering is appropriate for a specific recommendation environment without running the algorithm on the dataset or designing experiments? We propose a method that estimates the expected performance of CF algorithms by analysing only dataset statistics. In particular, we introduce measures that quantify dataset properties with respect to user co-ratings, and we show that, once trained on a small number of benchmark datasets, these measures predict the performance of collaborative filtering on a given dataset.
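
To make the idea of a co-rating statistic concrete, the following Python sketch computes two simple quantities from a list of (user, item) rating events: the fraction of user pairs that share at least one rated item, and the average number of co-rated items per user pair. The function name `co_rating_stats` and both statistics are illustrative assumptions, not the measures introduced in this work.

```python
from collections import defaultdict
from itertools import combinations


def co_rating_stats(ratings):
    """Summarise user co-ratings (hypothetical helper, for illustration only).

    ratings: iterable of (user, item) pairs, one per rating event.
    Returns a dict with:
      pair_coverage   - fraction of user pairs sharing at least one rated item
      mean_co_ratings - average number of co-rated items over all user pairs
    """
    users_by_item = defaultdict(set)
    users = set()
    for user, item in ratings:
        users_by_item[item].add(user)
        users.add(user)

    # Count, for every pair of users, how many items both of them rated.
    pair_counts = defaultdict(int)
    for raters in users_by_item.values():
        for u, v in combinations(sorted(raters), 2):
            pair_counts[(u, v)] += 1

    n_pairs = len(users) * (len(users) - 1) // 2
    if n_pairs == 0:
        return {"pair_coverage": 0.0, "mean_co_ratings": 0.0}
    return {
        "pair_coverage": len(pair_counts) / n_pairs,
        "mean_co_ratings": sum(pair_counts.values()) / n_pairs,
    }


# Toy data: users "a" and "b" share two items; user "c" shares none.
toy = [("a", "i1"), ("b", "i1"), ("a", "i2"), ("b", "i2"), ("c", "i3")]
stats = co_rating_stats(toy)
```

Statistics of this kind need only one pass over the rating data plus a pairwise count per item, so they are far cheaper to obtain than training and evaluating a CF model. Note that the pairwise count grows quadratically with the number of raters of an item, so very popular items may need sampling on large datasets.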