Early prediction of the future popularity of uploaded videos

Abstract: Predicting the popularity of videos on video-sharing sites is important for formulating online advertising and commercial marketing strategies. A predicted popularity value can help the system decide which videos to recommend to users and in what order to present related videos. However, most previous methods predict popularity from usage data collected after a video has been uploaded. Because they depend on this past access data, they cannot make predictions early in a video's life. To solve this problem, we propose a new method for the early prediction of video popularity that relies only on the data available at the time of upload. This study first uses a supervised learning approach to develop video popularity prediction models. Then, to further improve overall prediction accuracy, an ensemble model is developed that integrates the classification results of these models. Empirical evaluation shows that these models can effectively predict the future popularity of videos at the time of upload.
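The idea of integrating several classification results into one prediction can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the features, the base classifiers, or the combination rule, so the three hand-written rules below merely stand in for trained supervised models, the feature names (`uploader_subscribers`, `title`, `tags`) and thresholds are hypothetical, and simple majority voting is only one possible way to combine classifiers.

```python
from collections import Counter

# Stand-ins for trained classifiers (assumption): each maps upload-time
# metadata to a label. Feature names and thresholds are hypothetical.
def by_subscribers(video):
    return "popular" if video["uploader_subscribers"] >= 10_000 else "unpopular"

def by_title_length(video):
    return "popular" if 20 <= len(video["title"]) <= 70 else "unpopular"

def by_tag_count(video):
    return "popular" if len(video["tags"]) >= 5 else "unpopular"

def majority_vote(classifiers, video):
    """Combine the base classifiers' labels by simple majority vote."""
    votes = Counter(clf(video) for clf in classifiers)
    return votes.most_common(1)[0][0]

BASE_CLASSIFIERS = [by_subscribers, by_title_length, by_tag_count]

# A hypothetical video described only by data known at upload time.
new_upload = {
    "title": "How to tune a guitar in two minutes",
    "tags": ["guitar", "tutorial"],
    "uploader_subscribers": 25_000,
}

prediction = majority_vote(BASE_CLASSIFIERS, new_upload)
print(prediction)
```

Using an odd number of base classifiers with two labels avoids ties in the vote. In a realistic setting each rule would be replaced by a model fitted on labeled historical uploads.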
