Heavy-tailed distributions, also known as power-law distributions, have been observed in many natural phenomena, from physical to sociological. More recently, heavy-tailed distributions have been found in computer systems: in particular, the sizes (service demands) of computing jobs exhibit a heavy-tailed (power-law) distribution. Most previous analytic work on computer system design has assumed that job sizes are exponentially distributed, and many of the policies, algorithms, and rules of thumb currently used in computer systems originated from analyses that made this assumption. In this paper we argue that existing computer system algorithms need to be reevaluated in light of heavy-tailed workloads: an algorithm that is optimal under an exponentially distributed workload may be very far from optimal when the workload is heavy-tailed. We demonstrate this point via three common problems in the design of computer systems: (1) choosing a migration policy in a network of workstations; (2) choosing a task assignment policy for a distributed server; and (3) scheduling HTTP requests within a Web server. For each problem, we show that the answer depends strongly on the job size distribution. We show how to carry out analysis under heavy-tailed job size distributions, and that this analysis leads to policies whose performance improves greatly over commonly used solutions, in some cases by orders of magnitude. This paper is a compilation of a sequence of papers that the author co-wrote [9, 6, 7, 8, 1]. Far more detail is contained in the original papers.
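To make the contrast between exponential and heavy-tailed workloads concrete, the following minimal sketch computes what fraction of the total work is carried by the largest 1% of jobs under each model. It is an illustration only and is not taken from the paper: the function names are hypothetical, the heavy-tailed case is modeled as an unbounded Pareto distribution, and the shape parameter alpha = 1.1 is an assumed value chosen for the example. Both formulas are closed-form results for these two distributions.

```python
import math


def exp_top_load_fraction(p: float) -> float:
    """Fraction of total work in the largest fraction p of jobs
    when job sizes are exponentially distributed (any mean)."""
    # For a unit-mean exponential, P(S > x) = p gives x = ln(1/p).
    x = math.log(1.0 / p)
    # The integral of t*exp(-t) from x to infinity is (x + 1)*exp(-x),
    # and the mean is 1, so this is already the load fraction.
    return (x + 1.0) * math.exp(-x)


def pareto_top_load_fraction(p: float, alpha: float) -> float:
    """Fraction of total work in the largest fraction p of jobs
    when job sizes are Pareto with shape alpha (assumed model)."""
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 for the mean to be finite")
    # With P(S > x) = (k/x)**alpha, the load above the p-quantile
    # divided by the mean simplifies to p**((alpha - 1)/alpha).
    return p ** ((alpha - 1.0) / alpha)


if __name__ == "__main__":
    p = 0.01        # look at the largest 1% of jobs
    alpha = 1.1     # assumed Pareto shape, for illustration only
    print(f"exponential: {exp_top_load_fraction(p):.3f}")
    print(f"Pareto:      {pareto_top_load_fraction(p, alpha):.3f}")
```

Under these assumptions the largest 1% of jobs account for roughly 6% of the work in the exponential case and roughly two thirds of it in the Pareto case, which is why a policy tuned for one workload can behave very differently under the other.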
[1] Mor Harchol-Balter, et al. On Choosing a Task Assignment Policy for a Distributed Server System. J. Parallel Distributed Comput., 1998.
[2] M. Crovella, et al. Heavy-tailed probability distributions in the World Wide Web. 1998.
[3] Azer Bestavros, et al. Self-similarity in World Wide Web traffic: evidence and possible causes. TNET, 1997.
[4] Mor Harchol-Balter, et al. Exploiting process lifetime distributions for dynamic load balancing. SIGMETRICS, 1995.
[5] J. M. Carlson, et al. Highly optimized tolerance: a mechanism for power laws in designed systems. Physical Review E, Statistical physics, plasmas, fluids, and related interdisciplinary topics, 1999.
[6] Mor Harchol-Balter. The Case for SRPT Scheduling in Web Servers. 1998.
[7] Lester Lipsky, et al. Long-lasting transient conditions in simulations with heavy-tailed workloads. WSC '97, 1997.
[8] Timothy J. O'Donnell, et al. Analysis of the early workload on the Cornell Theory Center IBM SP2. SIGMETRICS '96, 1996.
[9] Teunis J. Ott, et al. Load-balancing heuristics and process behavior. SIGMETRICS '86/PERFORMANCE '86, 1986.
[10] Ronald W. Wolff, et al. Stochastic Modeling and the Theory of Queues. 1989.
[11] Mor Harchol-Balter, et al. Connection Scheduling in Web Servers. USENIX Symposium on Internet Technologies and Systems, 1999.
[12] Sally Floyd, et al. Wide-Area Traffic: The Failure of Poisson Modeling. SIGCOMM, 1994.