A hierarchical characterization of a live streaming media workload

We present a thorough characterization of what we believe to be the first significant live Internet streaming media workload in the scientific literature. Our characterization of over 3.5 million requests spanning a 28-day period is done at three increasingly granular levels, corresponding to clients, sessions, and transfers. Our findings support two important conclusions. First, we show that the nature of interactions between users and objects is fundamentally different for live versus stored objects. Access to stored objects is user driven, whereas access to live objects is object driven. This reversal of active/passive roles of users and objects leads to interesting dualities. For instance, our analysis underscores a Zipf-like profile for user interest in a given object, which is in contrast to the classic Zipf-like popularity of objects for a given user. Also, our analysis reveals that transfer lengths are highly variable and that this variability is due to client stickiness to a particular live object, as opposed to structural (size) properties of objects. Second, by contrasting two live streaming workloads from two radically different applications, we conjecture that some characteristics of live media access workloads are likely to be highly dependent on the nature of the live content being accessed. This dependence is clear from the strong temporal correlation observed in the traces, which we attribute to the impact of synchronous access to live content. Based on our analysis, we present a model for live media workload generation that incorporates many of our findings, and which we implement in GISMO.
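To make the "Zipf-like profile for user interest in a given object" concrete, the sketch below shows one common way such a profile could be checked: rank clients by how many requests each issued for a single live object, then fit a line to log(frequency) against log(rank). This is an illustrative assumption, not the fitting procedure used by the authors; the function name `zipf_exponent` and the input format (a flat list of client identifiers, one per request) are hypothetical.

```python
import math
from collections import Counter


def zipf_exponent(client_ids):
    """Estimate alpha in frequency ~ rank**(-alpha) for one object's requests.

    client_ids: iterable with one client identifier per request (hypothetical
    input format, not taken from the paper's traces).
    Returns the negated least-squares slope of log(frequency) vs log(rank).
    """
    # Requests per client, sorted so rank 1 is the most active client.
    counts = sorted(Counter(client_ids).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct clients")

    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # A Zipf-like profile shows up as a roughly straight line with slope -alpha.
    return -sxy / sxx


if __name__ == "__main__":
    # Synthetic example: four clients whose request counts fall off as 1/rank.
    requests = ["a"] * 1200 + ["b"] * 600 + ["c"] * 400 + ["d"] * 300
    print(round(zipf_exponent(requests), 3))
```

A least-squares fit on log-log axes is the simplest choice and is sensitive to noise in the tail; it is used here only to illustrate what "Zipf-like" means for per-client interest in one object.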
