Comparative evaluation of recommender system quality

Several researchers suggest that the recommender systems (RSs) that are "best" according to statistical metrics might not be the most satisfactory for users. We explored this issue through an empirical study that involved 210 users and considered 7 RSs using different recommender algorithms on the same dataset. We measured users' perceived quality of each RS and compared these results against measures of the algorithms' statistical quality, as assessed by past studies in the field, and we highlight the main findings of this comparison.