Heavy Tails in Queueing Systems: Impact of Parallelism on Tail Performance

In this paper, we quantify the efficiency of parallelism in systems that are prone to failures and exhibit power-law processing delays. We characterize the performance of two prototype parallelism schemes, redundant and split, in terms of both the power-law exponent and the exact asymptotics of the delay distribution tail. We also develop an optimal splitting scheme that ensures split always outperforms redundant.
