Optimal Load Balancing on Distributed Homogeneous Unreliable Processors

We consider optimal load balancing in a distributed computing environment consisting of homogeneous unreliable processors. Each processor receives its own sequence of tasks from outside users, some of which can be redirected to the other processors. Processing times are independent and identically distributed with an arbitrary distribution. The arrival sequence of outside tasks to each processor may be arbitrary, as long as it is independent of the state of the system. Processors may fail, with arbitrary failure and repair processes that are also independent of the state of the system. The only information available to a processor is the history of its own routing decisions and the arrival times of its own tasks. We prove the optimality of the round-robin policy, in which each processor sends all the tasks that can be redirected to each of the other processors in turn. We show that, among all policies that balance workload, round-robin stochastically minimizes the nth task completion time for all n, and minimizes response times and queue lengths in a separable increasing convex sense for the entire system. We also show that if there is a single centralized controller, round-robin is the optimal policy, and that a single controller using round-robin routing is better than the optimal distributed system in which each processor routes its own arrivals. Here again, "optimal" and "better" are in the sense of stochastically minimizing task completion times and minimizing response times and queue lengths in the separable increasing convex sense.
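
As an illustration only, the distributed round-robin rule described above can be sketched in a few lines of Python. The class name `RoundRobinRouter`, its parameters `own_id` and `num_processors`, and the integer processor labels are assumptions made for this sketch and do not come from the paper. The sketch shows one processor cycling through the other processors using nothing but its own routing history; it does not model failures, repairs, or processing times.

```python
from collections import Counter
from itertools import cycle


class RoundRobinRouter:
    """Hypothetical router for one processor (illustrative names only).

    It sends each redirectable task to the other processors in turn,
    relying only on its own past routing decisions (the cycle position).
    """

    def __init__(self, own_id: int, num_processors: int) -> None:
        if num_processors < 2:
            raise ValueError("need at least one other processor to route to")
        # Every processor except this one is a possible destination.
        self.targets = [p for p in range(num_processors) if p != own_id]
        self._next_target = cycle(self.targets)

    def route(self) -> int:
        """Return the destination for the next redirectable task."""
        return next(self._next_target)


# Processor 0 in a system of 4 processors redirects 7 tasks.
router = RoundRobinRouter(own_id=0, num_processors=4)
destinations = [router.route() for _ in range(7)]
counts = Counter(destinations)

print(destinations)  # [1, 2, 3, 1, 2, 3, 1]
print(dict(counts))  # {1: 3, 2: 2, 3: 2}
```

In this sketch the number of tasks sent to any two destinations never differs by more than one, which is the balancing behaviour the round-robin policy is meant to provide.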
