Tuning machine learning algorithms for content-based movie recommendation

Machine learning algorithms are often used in content-based recommender systems because a recommendation task reduces naturally to a classification problem: the recommender learns a classifier for a given user, where the learning examples are the characteristics of items the user previously liked, bought, or viewed. However, multi-valued and continuous attributes require special handling in the classifier implementation, as they can significantly influence classifier accuracy. In this paper we propose novel approaches for handling multi-valued and continuous attributes that are suited to the naive Bayes classifier and the decision tree classifier, and we tune them for content-based movie recommendation. We evaluate the performance of the resulting approaches using the MovieLens data set enriched with movie details retrieved from the Internet Movie Database. Our empirical results demonstrate that the naive Bayes classifier is more suitable for content-based movie recommendation than the decision tree algorithm. In addition, the naive Bayes classifier achieves better results with smart discretization of continuous attributes than with an approach that models continuous attributes with a Gaussian distribution. Finally, we combine our best-performing content-based algorithm with the k-means clustering algorithm, which is typically used for collaborative filtering, and evaluate the performance of the resulting hybrid approach on a movie recommendation task. The experimental results clearly show that the hybrid approach significantly increases recommendation accuracy compared to collaborative filtering while reducing the risk of overspecialization, which is a typical problem of content-based approaches.
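To make the reduction to classification concrete, the following is a minimal sketch of a per-user naive Bayes classifier over movie attributes. It is an illustration under stated assumptions and not the method evaluated in the paper: equal-width binning stands in for the "smart discretization", whose details the abstract does not give; a multi-valued attribute such as genre is treated as a bag of independent values; and all names (`equal_width_bin`, `DiscreteNaiveBayes`, `encode`) and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def equal_width_bin(value, low, high, n_bins):
    """Map a continuous value to a bin index in [0, n_bins - 1].

    Assumption: equal-width bins, used here only as a simple stand-in
    for whatever discretization the paper actually applies.
    """
    if value <= low:
        return 0
    if value >= high:
        return n_bins - 1
    return int((value - low) / (high - low) * n_bins)


class DiscreteNaiveBayes:
    """Naive Bayes over discrete attributes with Laplace smoothing.

    Each example is a dict mapping an attribute name to a list of values,
    so a multi-valued attribute simply contributes several values.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (attr, label) -> value counts
        self.vocab = defaultdict(set)             # attr -> values seen in training
        for row, label in zip(rows, labels):
            for attr, values in row.items():
                for v in values:
                    self.value_counts[(attr, label)][v] += 1
                    self.vocab[attr].add(v)
        return self

    def log_posteriors(self, row):
        """Return unnormalized log posterior scores, one per class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for attr, values in row.items():
                if attr not in self.vocab:
                    continue  # attribute never seen in training: ignore it
                counts = self.value_counts[(attr, label)]
                denom = sum(counts.values()) + self.alpha * len(self.vocab[attr])
                for v in values:
                    score += math.log((counts[v] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


def encode(movie):
    """Turn a raw movie record into discrete attributes (hypothetical schema)."""
    return {
        "genres": movie["genres"],  # multi-valued attribute
        "year_bin": [equal_width_bin(movie["year"], 1950, 2010, 6)],  # discretized
    }


# Toy rating history for one user (invented data, for illustration only).
train = [
    ({"genres": ["Sci-Fi", "Action"], "year": 1999}, "like"),
    ({"genres": ["Sci-Fi"], "year": 2003}, "like"),
    ({"genres": ["Action", "Thriller"], "year": 1995}, "like"),
    ({"genres": ["Romance", "Drama"], "year": 1962}, "dislike"),
    ({"genres": ["Romance"], "year": 1958}, "dislike"),
    ({"genres": ["Drama"], "year": 1971}, "dislike"),
]

model = DiscreteNaiveBayes().fit(
    [encode(movie) for movie, _ in train],
    [label for _, label in train],
)
prediction = model.predict(encode({"genres": ["Sci-Fi", "Thriller"], "year": 2001}))
print(prediction)
```

The Gaussian alternative mentioned above would replace the `year_bin` term with a per-class normal density fitted to the raw year; the sketch uses bins so that every attribute is handled by the same counting scheme.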
