Predicting Movie Popularity Using Support Vector Machines

Movies vary widely in rating and in popularity. A movie watched by many people can be considered popular, whereas a movie watched by few people can be considered not popular. Popularity can be determined by several factors, such as likes, ratings, and comments. To classify movies as popular or not popular based on such features, this research uses two classification methods: logistic regression and Support Vector Machine (SVM). The data are the Conventional and Social Media Movies (CSM) Dataset 2014 and 2015. To obtain the best model without ignoring the principle of parsimony, feature selection is performed. The selected features are genre, sentiment, likes, and comments, and these features are used to classify movie popularity. Logistic regression achieves an accuracy of 77.29%, while SVM achieves an accuracy of 83.78%. Of the two methods, SVM therefore gives the higher accuracy on the CSM dataset. The highest accuracy is obtained with the SVM method under a non-stratified holdout training-testing strategy.
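
The non-stratified holdout evaluation of an SVM described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than the CSM dataset, the two numeric features are placeholders, the 70/30 split ratio is assumed, and the linear SVM is trained with a Pegasos-style stochastic subgradient method chosen only to keep the example free of external dependencies. All function names are hypothetical.

```python
import random


def make_toy_data(n_samples=200, seed=1):
    """Synthetic two-feature data with labels in {-1, +1} (NOT the CSM dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = 1 if rng.random() < 0.5 else -1
        # Each class is centred at (+2, +2) or (-2, -2) with bounded noise.
        X.append([2 * label + rng.uniform(-1, 1), 2 * label + rng.uniform(-1, 1)])
        y.append(label)
    return X, y


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Non-stratified holdout: shuffle indices without looking at class labels."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]  # (train indices, test indices)


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Linear SVM without bias term, fitted by subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink the weights (gradient of the L2 regularisation term).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # If the sample violates the margin, move the weights towards it.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Predict +1 or -1 from the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


X, y = make_toy_data()
train_idx, test_idx = holdout_split(len(X))
w = train_linear_svm([X[i] for i in train_idx], [y[i] for i in train_idx])
y_pred = [predict(w, X[i]) for i in test_idx]
test_accuracy = accuracy([y[i] for i in test_idx], y_pred)
print(f"holdout accuracy: {test_accuracy:.2%}")
```

In a non-stratified split the class proportions in the training and test sets are left to chance, whereas a stratified split would preserve the overall class proportions in both sets.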