Reading and sharing news online has become an important part of how people entertain themselves. It would therefore be very helpful to social media workers (authors, advertisers, etc.) if the popularity of a news post could be predicted accurately before its publication. Our goal is to predict the popularity of a news post, measured by its number of shares, from various features (see Table I). In this project, we applied linear regression, logistic regression, decision trees, SVMs, kNN, KPLS and SVR (each tested with different parameters) to classify each post as “popular” or “unpopular”. We also used PCA, forward/backward selection and Fisher correlation scores to select features, and we compared the results in detail.

DATA AND FEATURES
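As a minimal sketch of the classification target described above, the share count of each post can be turned into a binary “popular”/“unpopular” label by comparing it with a threshold. The text here does not state which threshold is used, so the median-based default, the function name `label_popularity` and the toy share counts below are all assumptions made for illustration.

```python
from statistics import median


def label_popularity(shares, threshold=None):
    """Return 1 ("popular") or 0 ("unpopular") for each share count.

    Assumption: a post counts as popular when its share count is strictly
    above the threshold. If no threshold is given, the median of the
    sample is used, which yields roughly balanced classes.
    """
    if threshold is None:
        threshold = median(shares)
    return [1 if s > threshold else 0 for s in shares]


# Toy share counts, invented purely for illustration (not from the data set).
toy_shares = [120, 950, 3400, 610, 15000, 1400]
toy_labels = label_popularity(toy_shares)
```

With labels of this form, any of the classifiers listed above can be trained on the feature vectors, and the regression models can be thresholded in the same way to give a class prediction.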