Efficient fitting of long-tailed data sets into hyperexponential distributions

We propose a new technique for fitting long-tailed data sets into hyperexponential distributions. The approach partitions the data set in a divide-and-conquer fashion and uses the expectation-maximization (EM) algorithm to fit the data of each partition into a hyperexponential distribution. The fitting results of all partitions are then combined to generate the fitting for the entire data set. The new method is accurate and efficient, and it makes it possible to apply existing analytic tools to queueing systems that operate under long-tailed workloads, such as queues in Internet-related systems.
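
The partition, fit, and combine steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the log-spaced partition boundaries, the fixed number of phases per partition, the weighting of each partition by its share of the samples, and all function names (`split_log`, `em_hyperexp`, `fit_partitioned`) are assumptions made for illustration only. The synthetic data at the end is likewise made up.

```python
import math
import random

def em_hyperexp(data, k, iters=100):
    """EM fit of a k-phase hyperexponential to strictly positive samples."""
    xs = sorted(data)
    n = len(xs)
    k = min(k, n)  # never use more phases than samples
    # Start each phase at the rate implied by one slice of the sorted data.
    slices = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    rates = [len(s) / sum(s) for s in slices]
    probs = [1.0 / k] * k
    for _ in range(iters):
        log_pr = [math.log(max(p, 1e-300)) + math.log(r)
                  for p, r in zip(probs, rates)]
        resp = [0.0] * k    # summed responsibilities per phase
        resp_x = [0.0] * k  # responsibility-weighted sum of samples
        for x in xs:
            # E-step in log space to avoid underflow for large x.
            logd = [c - r * x for c, r in zip(log_pr, rates)]
            top = max(logd)
            w = [math.exp(v - top) for v in logd]
            total = sum(w)
            for j in range(k):
                resp[j] += w[j] / total
                resp_x[j] += w[j] / total * x
        # M-step: closed-form updates for a mixture of exponentials.
        probs = [s / n for s in resp]
        rates = [s / sx if sx > 0 else r
                 for s, sx, r in zip(resp, resp_x, rates)]
    return probs, rates

def split_log(data, parts):
    """Assumed partitioning rule: log-spaced value ranges, empty ones dropped."""
    lo, hi = min(data), max(data)
    if hi == lo:
        return [list(data)]
    span = math.log(hi / lo)
    buckets = [[] for _ in range(parts)]
    for x in data:
        idx = min(parts - 1, int(parts * math.log(x / lo) / span))
        buckets[idx].append(x)
    return [b for b in buckets if b]

def fit_partitioned(data, parts=3, k=2):
    """Fit each partition separately, then merge into one hyperexponential."""
    n = len(data)
    probs, rates = [], []
    for bucket in split_log(data, parts):
        p, r = em_hyperexp(bucket, k)
        weight = len(bucket) / n  # partition's share of all samples
        probs += [weight * pj for pj in p]
        rates += r
    return probs, rates

# Usage on synthetic long-tailed data (illustrative values only).
random.seed(1)
data = [random.expovariate(1.0) if random.random() < 0.9
        else random.expovariate(0.01) for _ in range(1000)]
probs, rates = fit_partitioned(data, parts=3, k=2)
for p, r in zip(probs, rates):
    print(f"phase probability {p:.4f}, rate {r:.5f}")
```

Because each partition covers a narrow range of values, a small number of phases per partition is enough, and the EM updates for a mixture of exponentials keep the fitted mean equal to the sample mean of that partition. Weighting partitions by sample share therefore preserves the mean of the whole data set in the combined distribution.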
