Predicting the Popularity of Web 2.0 Items Based on User Comments

In the current Web 2.0 era, the popularity of Web resources is ephemeral, fluctuating with trends and social interest. As a result, content-based relevance signals alone are insufficient to meet users' constantly evolving information needs when searching for Web 2.0 items. Incorporating future popularity into ranking is one way to counter this. However, predicting popularity as a third party (as in the case of general search engines) is difficult in practice, because third parties have limited access to item view histories. To enable popularity prediction externally without excessive crawling, we propose an alternative solution that leverages user comments, which are more accessible than view counts. Because comments are sparse, traditional solutions that rely solely on view histories do not perform well. To deal with this sparsity, we mine comments to recover additional signals, such as social influence. By modeling comments as a time-aware bipartite graph, we propose a regularization-based ranking algorithm that accounts for temporal, social influence, and current popularity factors to predict the future popularity of items. Experimental results on three real-world datasets (crawled from YouTube, Flickr and Last.fm) show that our method consistently outperforms competitive baselines in several evaluation tasks.
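
To make the idea of ranking over a time-aware user-item comment graph more concrete, the following is a minimal illustrative sketch, not the algorithm evaluated in this work. It assumes an exponential half-life decay for comment age and a simple two-step score propagation between items and users, smoothed toward a current-popularity prior; the function names, the `half_life` and `alpha` parameters, and the update rule itself are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def time_decayed_edges(comments, now, half_life):
    """Aggregate (user, item, timestamp) comments into user-item edge weights.

    Assumption: a comment's weight halves every `half_life` time units.
    """
    weights = defaultdict(float)
    for user, item, timestamp in comments:
        age = max(0.0, now - timestamp)
        weights[(user, item)] += 0.5 ** (age / half_life)
    return dict(weights)


def rank_items(comments, current_popularity, now,
               half_life=7.0, alpha=0.5, iterations=50):
    """Rank items by a score propagated over the bipartite comment graph.

    Hypothetical update rule: each item's score is a mix of its normalized
    current popularity (weight `alpha`) and the score that flows from items
    to users and back to items along time-decayed comment edges.
    """
    edges = time_decayed_edges(comments, now, half_life)
    users = sorted({u for u, _ in edges})
    items = sorted(set(current_popularity) | {i for _, i in edges})

    # Weighted degrees, used to normalize each propagation step.
    user_degree = defaultdict(float)
    item_degree = defaultdict(float)
    for (u, i), w in edges.items():
        user_degree[u] += w
        item_degree[i] += w

    # Current popularity, normalized to sum to 1, acts as the prior.
    total = sum(current_popularity.get(i, 0.0) for i in items) or 1.0
    prior = {i: current_popularity.get(i, 0.0) / total for i in items}

    item_score = dict(prior)
    for _ in range(iterations):
        # Items pass their score to the users who commented on them.
        user_score = {u: 0.0 for u in users}
        for (u, i), w in edges.items():
            user_score[u] += w / item_degree[i] * item_score[i]
        # Users pass their score back to the items they commented on.
        propagated = {i: 0.0 for i in items}
        for (u, i), w in edges.items():
            propagated[i] += w / user_degree[u] * user_score[u]
        item_score = {
            i: alpha * prior[i] + (1.0 - alpha) * propagated[i]
            for i in items
        }

    ranking = sorted(items, key=lambda i: item_score[i], reverse=True)
    return ranking, item_score


if __name__ == "__main__":
    # Toy data: item "a" has two recent comments, item "b" one old comment.
    toy_comments = [("u1", "a", 10.0), ("u2", "a", 10.0), ("u1", "b", 1.0)]
    ranking, scores = rank_items(toy_comments, {"a": 10, "b": 10}, now=10.0)
    print(ranking, scores)
```

In this toy setup both items start with equal current popularity, so any difference in the final ranking comes from the recency and number of comments alone. A fuller treatment would replace the ad hoc update with an explicit regularization objective over the graph.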
