The literature on recommendation systems indicates that the choice of methodology significantly influences the quality of recommendations. The impact of the amount of available data on the performance of recommendation systems, however, has not been systematically investigated. The authors study different approaches to recommendation systems using the publicly available EachMovie data set, which contains ratings for movies and videos. In contrast to previous work on this data set, a significantly larger subset is used here. The effects of the number of available customers and movies, as well as their interaction with the different methods, are investigated. Two commonly used collaborative filtering approaches are compared with several regression models in a full factorial experimental design. According to the findings, the number of customers significantly influences the performance of all approaches under study. For a large number of customers and movies, simple linear regression with model selection is shown to provide significantly better recommendations than collaborative filtering. From a managerial perspective, these results suggest which model to select depending on the amount of data available. Furthermore, the impact of enlarging the customer database on the quality of recommendations is shown.