Cluster-based fitting of phase-type distributions to empirical data

We present a clustering-based approach to fitting phase-type distributions that is particularly well suited to capturing common characteristics of empirical data sets. The distributions it produces are especially useful for efficient simulation. We describe the Hyper-* tool, which implements the algorithm and offers a user-friendly interface for efficient phase-type fitting. We compare cluster-based fitting with segmentation-based approaches and other algorithms, and show that clustering provides good results for typical empirical data sets.
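The general idea of cluster-based fitting can be illustrated with a minimal sketch. It is not the Hyper-* implementation, and every detail is an assumption made for illustration: the samples are partitioned with Lloyd's k-means algorithm, each cluster is matched to one Erlang branch by its mean and squared coefficient of variation, and the branch weights are the cluster proportions. The function names (`kmeans_1d`, `fit_erlang_branch`, `fit_hyper_erlang`) and the `max_phases` cap are hypothetical.

```python
import random


def kmeans_1d(samples, k, iterations=50, seed=0):
    """Lloyd's algorithm on scalar samples; returns the non-empty clusters."""
    rng = random.Random(seed)
    centers = rng.sample(samples, k)  # assumes k <= len(samples)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Keep the old center if a cluster received no samples.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return [c for c in clusters if c]


def fit_erlang_branch(cluster, max_phases=100):
    """Match an Erlang(n, rate) branch to one cluster of positive samples.

    An Erlang(n, rate) has mean n / rate and squared coefficient of
    variation 1 / n, so n is chosen from the cluster's variability and
    the rate from its mean. max_phases is an assumed cap on n.
    """
    mean = sum(cluster) / len(cluster)
    variance = sum((x - mean) ** 2 for x in cluster) / len(cluster)
    scv = variance / mean ** 2
    if scv * max_phases <= 1:
        n = max_phases  # very low variability: use the largest allowed n
    else:
        n = max(1, round(1 / scv))
    return n, n / mean


def fit_hyper_erlang(samples, k):
    """Return a list of (weight, phases, rate) tuples, one per cluster."""
    clusters = kmeans_1d(samples, k)
    total = len(samples)
    return [(len(c) / total, *fit_erlang_branch(c)) for c in clusters]
```

Because each branch reproduces the mean of its cluster and the weights are the cluster proportions, the mixture in this sketch matches the sample mean exactly. Higher moments are only approximated, since the number of phases is rounded to an integer.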
