Optimizing the recency-relevance-diversity trade-offs in non-personalized news recommendations

Online news media sites are emerging as the primary source of news for a large number of users. Because these sites publish a large number of stories, users usually rely on news recommendation systems to find important news. In this work, we focus on automatically recommending news stories to all users of such media websites, where the selection is not influenced by any particular user's news reading habits. When recommending news stories in such a non-personalized manner, there are three basic metrics of interest: recency, importance (analogous to relevance in personalized recommendation), and diversity of the recommended news. Ideally, recommender systems should recommend the most important stories soon after they are published. However, the importance of a story only becomes evident as the story ages, which creates a tension between recency and importance. A systematic analysis of popular recommendation strategies in use today reveals that they lead to poor trade-offs between recency and importance in practice. In this paper, we therefore propose a new recommendation strategy, called Highest Future-Impact, which attempts to optimize along both axes. To implement the strategy in practice, we propose two approaches to predicting the future impact of news stories: using crowd-sourced popularity signals, and observing editorial selection in past news data. Finally, we propose approaches to introduce diversity into the recommended news, so that the recommendations maintain a balanced proportion of stories from different news sections. Evaluations over real-world news datasets show that our implementations achieve good performance in recommending news stories.
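As a rough illustration of the idea, the sketch below ranks candidate stories by a predicted future-impact score and applies a simple per-section cap to keep the selection balanced. It is not the paper's method: the abstract does not specify how future impact is predicted or how diversity is enforced, so the `Story` fields, the `recommend` function, the `max_per_section` cap, and the example scores are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Story:
    story_id: str
    section: str
    # Hypothetical score; assumed to come from some future-impact predictor.
    predicted_future_impact: float


def recommend(stories, k, max_per_section):
    """Greedily pick up to k stories by predicted impact, capping each section."""
    ranked = sorted(stories, key=lambda s: s.predicted_future_impact, reverse=True)
    chosen = []
    per_section = defaultdict(int)
    for story in ranked:
        if len(chosen) == k:
            break
        # Skip a story if its section has already filled its quota.
        if per_section[story.section] < max_per_section:
            chosen.append(story)
            per_section[story.section] += 1
    return chosen


# Made-up candidates, for illustration only.
candidates = [
    Story("a", "politics", 0.9),
    Story("b", "politics", 0.8),
    Story("c", "politics", 0.7),
    Story("d", "sports", 0.6),
    Story("e", "tech", 0.5),
]
picked = recommend(candidates, k=3, max_per_section=2)
print([s.story_id for s in picked])
```

With a cap of two stories per section, the third politics story is skipped in favor of the next-highest story from another section. A hard cap is only one way to trade importance against diversity; a weighted objective would be another.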
