Popularity Dynamics and Intrinsic Quality in Reddit and Hacker News

In this paper we seek to understand the relationship between the online popularity of an article and its intrinsic quality. Prior experimental work suggests that this relationship can be heavily distorted by factors such as social influence bias and inequality in visibility. We conduct a study of popularity on two social news aggregators, Reddit and Hacker News. We define quality as the number of votes an article would have received if every article were shown, in a bias-free way, to an equal number of users. We propose a simple Poisson regression method to estimate this quality metric from time-series voting data. We validate our methods on data from Reddit and Hacker News, as well as on the experimental data from prior work. Using these estimates, we find that popularity on Reddit and Hacker News is a relatively strong reflection of intrinsic quality.
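To make the idea of estimating quality from time-series voting data concrete, here is a minimal sketch of one possible Poisson model. It is not the paper's actual specification, which the abstract does not give. It assumes that the votes an article receives in a time interval are Poisson-distributed, with a rate equal to a per-article quality multiplied by a per-position visibility factor. The function name `fit_quality`, the `(article_id, position, votes)` input format, and the use of list position as the only visibility covariate are all hypothetical. The fit uses alternating closed-form updates that satisfy the Poisson likelihood equations for this two-factor model, in place of a general-purpose optimizer. It also assumes every article and every position has at least one vote.

```python
from collections import defaultdict


def fit_quality(observations, n_iter=200):
    """Fit votes ~ Poisson(quality[article] * visibility[position]).

    observations: list of (article_id, position, votes), one per time interval.
    Returns (quality, visibility) dicts on a multiplicative scale, with the
    visibility of the smallest position value fixed to 1 for identifiability.
    Assumes every article and every position has at least one vote.
    """
    votes_by_article = defaultdict(float)
    votes_by_position = defaultdict(float)
    for article, position, votes in observations:
        votes_by_article[article] += votes
        votes_by_position[position] += votes

    quality = {a: 1.0 for a in votes_by_article}
    visibility = {p: 1.0 for p in votes_by_position}
    reference = min(visibility)

    for _ in range(n_iter):
        # Update each article's quality given the current visibility factors.
        exposure = defaultdict(float)
        for article, position, _votes in observations:
            exposure[article] += visibility[position]
        for article in quality:
            quality[article] = votes_by_article[article] / exposure[article]

        # Update each position's visibility given the current qualities.
        audience = defaultdict(float)
        for article, position, _votes in observations:
            audience[position] += quality[article]
        for position in visibility:
            visibility[position] = votes_by_position[position] / audience[position]

        # Remove the scale ambiguity: the reference position has visibility 1.
        scale = visibility[reference]
        for position in visibility:
            visibility[position] /= scale
        for article in quality:
            quality[article] *= scale

    return quality, visibility


# Toy data: article A spends more intervals in the top position than B.
observations = [
    ("A", 1, 4), ("A", 1, 4), ("A", 2, 2),
    ("B", 1, 4), ("B", 2, 2), ("B", 2, 2),
]
quality, visibility = fit_quality(observations)
print(quality, visibility)
```

In the toy data, A collects more raw votes than B only because it spends more intervals in the more visible position. Under the assumed model, the fitted quality values separate that visibility effect from the per-article term.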
