Sketching Algorithms for Approximating Rank Correlations in Collaborative Filtering Systems

Collaborative filtering (CF) shares information between users to provide each with recommendations. Previous work suggests using sketching techniques to handle massive data sets in CF systems, but only allows testing whether users have a high proportion of items they have both ranked. We show how to determine the correlation between the rankings of two users, using concise "sketches" of the rankings. The sketches allow approximating Kendall's Tau, a known rank correlation, with high accuracy ε and high confidence 1 − δ. The required sketch size is logarithmic in the confidence and polynomial in the accuracy.
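As an illustration of how a small sketch can approximate Kendall's Tau, the following Python sketch uses plain pair sampling with a shared random seed. This is an assumption made for illustration and not necessarily the construction used in this work. It further assumes both users rank the same items without ties. The names `sample_size`, `make_pairs`, `sketch`, and `estimate_tau` are hypothetical, and the sketch length comes from a standard Hoeffding bound for an accuracy parameter `eps` and failure probability `delta`.

```python
import math
import random
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Exact Kendall's Tau for two rankings of the same items (no ties)."""
    items = list(rank_a)
    concordant = discordant = 0
    for x, y in combinations(items, 2):
        s = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(items) * (len(items) - 1) // 2
    return (concordant - discordant) / n_pairs


def sample_size(eps, delta):
    """Hoeffding bound for +/-1 valued samples: error <= eps with prob >= 1 - delta."""
    return math.ceil(2.0 / eps ** 2 * math.log(2.0 / delta))


def make_pairs(items, k, seed):
    """Shared randomness (assumption): every user derives the same k item pairs."""
    rng = random.Random(seed)
    return [tuple(rng.sample(items, 2)) for _ in range(k)]


def sketch(rank, pairs):
    """One bit per sampled pair: 1 if the user ranks x ahead of y."""
    return [1 if rank[x] < rank[y] else 0 for x, y in pairs]


def estimate_tau(sketch_a, sketch_b):
    """Tau equals P(pair concordant) - P(pair discordant) = 2 * P(bits agree) - 1."""
    agree = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return 2.0 * agree / len(sketch_a) - 1.0


if __name__ == "__main__":
    items = list(range(200))
    rank_a = {item: position for position, item in enumerate(items)}
    shuffled = items[:]
    random.Random(7).shuffle(shuffled)
    rank_b = {item: position for position, item in enumerate(shuffled)}

    k = sample_size(eps=0.1, delta=0.05)
    pairs = make_pairs(items, k, seed=42)
    print("sketch bits:", k)
    print("exact tau:", kendall_tau(rank_a, rank_b))
    print("estimate :", estimate_tau(sketch(rank_a, pairs), sketch(rank_b, pairs)))
```

In this illustrative scheme the sketch length grows as 1/eps^2 and as log(1/delta), and it does not depend on the number of items, which matches the polynomial and logarithmic dependence described above.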