Analysis and characterization of a video-on-demand service workload

Video-on-Demand (VoD) and video sharing services account for a large percentage of the total downstream Internet traffic. To provide a better understanding of the load on these services, we analyze and model a workload trace from a VoD service provided by a major Swedish TV broadcaster. The trace contains over half a million requests generated by more than 20,000 unique users. Among other things, we study the request arrival rate, the inter-arrival time, the spikes in the workload, the video popularity distribution, the streaming bit-rate distribution, and the video duration distribution. Our results show that the user and session arrival rates for the TV4 workload do not follow a Poisson process. The arrival rate distribution is modeled using a lognormal distribution, while the inter-arrival time distribution is modeled using a stretched exponential distribution. We observe "impatient user" behavior, where users abandon streaming sessions within minutes or even seconds of starting them. Both very popular and non-popular videos are particularly affected by impatient users. We investigate whether this behavior is an invariant for VoD workloads.
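
The two distribution models named above can be illustrated with a minimal sketch. It is not the authors' fitting procedure: the function names are hypothetical, the data is synthetic rather than taken from the trace, and the stretched exponential is fitted by a simple least-squares line on the transformed empirical survival function, chosen here only because it needs nothing beyond the Python standard library. The stretched exponential is taken in its survival form P(X > x) = exp(-(x / x0)^c) with c < 1. The coefficient of variation is included as a quick check against Poisson arrivals, whose exponential inter-arrival times have a coefficient of variation of 1.

```python
import math
import random
import statistics


def fit_lognormal(samples):
    """Maximum-likelihood lognormal fit: mean and std dev of the log-samples."""
    logs = [math.log(x) for x in samples if x > 0]
    return statistics.fmean(logs), statistics.pstdev(logs)


def fit_stretched_exponential(samples):
    """Fit P(X > x) = exp(-(x / x0) ** c) by least squares.

    Taking logs twice gives log(-log P) = c * log(x) - c * log(x0),
    so c is the slope and x0 follows from the intercept.
    """
    xs = sorted(x for x in samples if x > 0)
    n = len(xs)
    u, v = [], []
    for i, x in enumerate(xs):
        survival = 1.0 - (i + 0.5) / n  # empirical survival, strictly in (0, 1)
        u.append(math.log(x))
        v.append(math.log(-math.log(survival)))
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    slope = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / sum(
        (a - mu) ** 2 for a in u
    )
    intercept = mv - slope * mu
    return slope, math.exp(-intercept / slope)  # (c, x0)


def coefficient_of_variation(samples):
    """Std dev divided by mean; close to 1 for exponential inter-arrival times."""
    return statistics.pstdev(samples) / statistics.fmean(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic inter-arrival times (assumed parameters, not from the trace).
    gaps = [rng.weibullvariate(2.0, 0.6) for _ in range(5000)]
    c, x0 = fit_stretched_exponential(gaps)
    print(f"stretched exponential: c={c:.2f}, x0={x0:.2f}")
    print(f"coefficient of variation: {coefficient_of_variation(gaps):.2f}")
    # Synthetic per-interval arrival rates (assumed parameters).
    rates = [rng.lognormvariate(3.0, 0.5) for _ in range(5000)]
    mu, sigma = fit_lognormal(rates)
    print(f"lognormal: mu={mu:.2f}, sigma={sigma:.2f}")
```

A coefficient of variation well above 1, together with a fitted exponent c below 1, is the kind of evidence that would argue against a Poisson arrival process; a rigorous analysis would add a goodness-of-fit test on the real trace.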
