Although queueing models have long been used to model the performance of computer systems, they are out of favor with practitioners because they have a reputation for requiring unrealistic distributional assumptions. In fact, these distributional assumptions are used mainly to facilitate analytic approximations such as asymptotics and large-deviations bounds. In this paper, we analyze queueing networks from the probabilistic modeling perspective, applying inference methods from graphical models that afford significantly more modeling flexibility. In particular, we present a Gibbs sampler and a stochastic EM algorithm for networks of M/M/1 FIFO queues. As an application of this technique, we localize performance problems in distributed systems from incomplete system trace data. On both synthetic networks and an actual distributed Web application, the model accurately recovers the system's service time using only 1% of the available trace data.
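To make the M/M/1 FIFO setting concrete, the following is a minimal sketch, not the paper's Gibbs sampler or stochastic EM algorithm. It simulates a single queue and estimates the service rate in the easy case where every arrival and departure time is observed. The function names, rates, and seed are illustrative assumptions. With incomplete traces, most of these times would be missing, which is the situation the sampling-based inference described above is meant to address.

```python
import random


def simulate_mm1_fifo(n_tasks, arrival_rate, service_rate, seed=0):
    """Simulate one M/M/1 FIFO queue; return (arrivals, departures)."""
    rng = random.Random(seed)
    arrivals, departures = [], []
    clock = 0.0           # arrival time of the current task
    prev_departure = 0.0  # time the server finished the previous task
    for _ in range(n_tasks):
        clock += rng.expovariate(arrival_rate)  # Poisson arrivals
        start = max(clock, prev_departure)      # FIFO: wait for the server
        prev_departure = start + rng.expovariate(service_rate)
        arrivals.append(clock)
        departures.append(prev_departure)
    return arrivals, departures


def service_times(arrivals, departures):
    """Recover each task's service time from a fully observed trace."""
    times = []
    prev_departure = 0.0
    for arrival, departure in zip(arrivals, departures):
        times.append(departure - max(arrival, prev_departure))
        prev_departure = departure
    return times


def mle_service_rate(arrivals, departures):
    """Complete-data maximum-likelihood estimate of the exponential rate."""
    times = service_times(arrivals, departures)
    return len(times) / sum(times)


if __name__ == "__main__":
    arr, dep = simulate_mm1_fifo(20000, arrival_rate=0.5, service_rate=1.0, seed=1)
    print("estimated service rate:", mle_service_rate(arr, dep))
```

The estimate is the number of tasks divided by the total service time, the standard maximum-likelihood estimator for an exponential rate. It relies on knowing both the arrival time of each task and the departure time of its predecessor, which is exactly what an incomplete trace does not provide.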