Analysis of User Behavior in a Large-Scale VoD System

Understanding streaming user behavior is crucial to the design of large-scale video-on-demand (VoD) systems. However, existing studies usually treat all users as a single entity and analyze only their collective behavior. In this paper, we measure the individual viewing behavior of 10 million sampled users from two perspectives, temporal characteristics and user interest, and present our results by dividing users into active and inactive groups. We observe that active users spend more hours viewing on each active day and that their daily request times are more scattered than those of inactive users, while the inter-viewing time distribution differs negligibly between the two groups. We also show that active and inactive users share some viewing behaviors, e.g., a common interest in popular videos and in the latest uploaded videos. We further propose a modified Weibull distribution to fit users' view completion rate, which fits well across different video categories. To identify users with similar viewing behaviors, we cluster them into 24 classes using their daily request timestamps or into 11 classes using the categories of the videos they watch. Analysis of the cluster centroids demonstrates the effectiveness of the clustering, bringing us a step closer to understanding user behavior in large-scale VoD systems.
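
As an illustration of the timestamp-based clustering described above, the following is a minimal sketch and not the authors' implementation. It assumes that each user is summarized by a normalized 24-bin histogram of request hours and that plain k-means with random initialization is used; the abstract does not specify the feature representation or the clustering algorithm, and the names `hourly_profile` and `kmeans` are hypothetical.

```python
import numpy as np


def hourly_profile(timestamps_s):
    """Normalized 24-bin histogram of request hours for one user.

    Assumption: timestamps are non-negative seconds since an epoch, and the
    hour of day is taken in that epoch's time zone.
    """
    hours = (np.asarray(timestamps_s, dtype=np.int64) // 3600) % 24
    counts = np.bincount(hours, minlength=24).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts


def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means (Lloyd's iterations) with random initial centroids."""
    rng = np.random.default_rng(seed)
    # Pick k distinct rows of X as the starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every user to every centroid.
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Final assignment against the centroids that are returned.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), centroids
```

Each returned centroid is itself an average hourly profile, so it can be read directly as the typical daily request pattern of the users assigned to that cluster.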
