Predicting and Evaluating the Popularity of Online News

Reading and sharing news online has become an important part of people’s everyday entertainment. It would therefore be very helpful to social media workers (authors, advertisers, etc.) if the popularity of a news post could be accurately predicted before publication. Our goal is to predict the popularity of a news post, measured by its number of shares, from a set of features (see Table I). In this project, we applied linear regression, logistic regression, decision trees, SVMs, kNN, KPLS, and SVR (each tested with different parameters) to classify each post as “popular” or “unpopular”. We also used PCA, forward/backward selection, and Fisher correlation scores to select features, and we compare the results in detail.

DATA AND FEATURES
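
As a minimal sketch of how the “popular”/“unpopular” labels could be derived from share counts, the snippet below binarizes a list of counts at a cut-off. The function name `label_popularity`, the default choice of the median as the cut-off, and the example counts are all assumptions made for illustration; the text above does not specify the threshold actually used.

```python
from statistics import median


def label_popularity(shares, threshold=None):
    """Map share counts to 1 ("popular") or 0 ("unpopular").

    Assumption: when no threshold is given, the median share count is
    used as the cut-off. This is an illustrative choice only.
    """
    if threshold is None:
        threshold = median(shares)
    # A post counts as popular only if it is strictly above the cut-off.
    return [1 if s > threshold else 0 for s in shares]


# Hypothetical share counts, for illustration only.
example_shares = [120, 4500, 800, 1500, 95000, 600]
labels = label_popularity(example_shares)
```

A median cut-off yields roughly balanced classes, which keeps plain accuracy meaningful when comparing classifiers; any other threshold can be passed explicitly through the `threshold` argument.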