Statistical Analysis of the Workload of a Video Hosting Server

The amount of data hosted by Internet servers and data centers is increasing at a remarkable pace, which calls for more capable and more efficient servers. However, physical efficiency does not necessarily correlate with computational efficiency. In fact, independent studies reveal that Internet servers are mostly overprovisioned, and yet additional servers are deployed each year. Understanding the characteristics of server workload is an essential step toward managing servers efficiently. For example, workload statistics make it possible to predict idle or underutilized states and to consolidate workload so that idle or underutilized servers can be switched off. In this paper, we systematically analyze the characteristics of video servers, since they are responsible for producing the largest share of Internet traffic, and provide insight into the relationship between the statistics pertaining to workload, the size of videos, and service time. We show that, from the distribution of video sizes on host servers, it is possible to estimate both the distribution of the workload size produced by clients and the distribution of the time required to process individual requests.
