Predicting Movie Ratings with Machine Learning Algorithms

Because a film is a hedonic product, its quality is difficult to assess before consumption, so consumers who want to reduce uncertainty rely on various quality signals in their decision-making. In recent years, in addition to movie-related information, user reviews and ratings on online movie databases have become important quality signals, and many viewers use these sites to decide which movie to watch or whether to watch a particular movie at all. This study attempts to estimate the rating and popularity of a movie from its main product features, such as origin, production year, actors, and plot. A database of 8943 movies shot between 2000 and 2019 is built from the website sinemalar.com with the help of a web crawler. Latent Dirichlet allocation (LDA) topic extraction is applied to the plots, and the assigned topics obtained from the LDA analysis, along with other movie-related attributes, are used to predict the rating class and popularity class of a movie with machine learning algorithms such as random forest, gradient boosting tree, and decision tree. The relative importance of the variables is also examined using the random forest algorithm's attribute statistics, based on each attribute's contribution to the predictive power of the model.
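To make the topic-extraction step concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written in plain Python. It is not the authors' implementation: the function name `lda_gibbs`, the toy plot strings, the number of topics, and the hyperparameters `alpha` and `beta` are all assumptions chosen for illustration. It shows how each plot could be given a topic distribution and a single assigned topic, which could then serve as a categorical attribute alongside origin, year, and actors.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (theta, topic_word): per-document topic proportions and
    per-topic word counts. Hyperparameter defaults are illustrative.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * n_topics for _ in docs]        # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic
    assign = []                                        # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


# Invented toy plots; a real run would use the crawled plot texts.
plots = [
    "detective murder city police investigation",
    "police detective chase murder suspect",
    "love wedding family summer romance",
    "romance family love letter wedding",
]
docs = [p.split() for p in plots]
n_topics = 2  # assumed; the abstract does not state the number of topics
theta, topic_word = lda_gibbs(docs, n_topics)

# Assigned topic per plot: the topic with the largest proportion.
assigned = [max(range(n_topics), key=row.__getitem__) for row in theta]
print(assigned)
```

The assigned topic is taken as the most probable topic per plot, which is one plausible reading of "assigned topics" in the abstract; the full topic proportions in `theta` could be used as numeric attributes instead. Topic indices are arbitrary labels, so only the grouping of plots is meaningful.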
