Online News Popularity Prediction

Applying data mining algorithms to large datasets is common practice, and the growth of online news has made it especially useful. Neural Networks, Random Forest, Support Vector Machines (SVM), and Naïve Bayes are among the most widely used algorithms for classification. In this research, we aimed to find the best model and feature set for predicting the popularity of online news by applying several of these machine-learning algorithms to the selected features. The data source was Mashable, a well-known online news website. We evaluated each model using Precision, Recall, and F-measure, and compared the models against one another and against previous work on the same dataset. Random Forest and Neural Network turned out to be the best models for prediction; both can achieve an accuracy of 65% with optimal parameters. Our work can help online news companies predict the popularity of news articles before publication.
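
For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how Precision, Recall, and F-measure can be computed for a binary popular/unpopular classifier. The function name `precision_recall_f1`, the label encoding (1 for popular, 0 for unpopular), and the toy labels are assumptions made for illustration; they are not taken from the Mashable dataset or from the experiments reported here.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F-measure) for one positive class.

    y_true and y_pred are equal-length sequences of class labels.
    A metric is reported as 0.0 when its denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted popular and actually popular.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted popular but actually unpopular.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: predicted unpopular but actually popular.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels, invented for illustration only (1 = popular, 0 = unpopular).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Reporting all three metrics alongside accuracy is useful because accuracy alone can hide a model that favors one class, which matters when comparing classifiers on the same dataset.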