Fair Load-Balancing on Parallel Systems for QoS

Many load-balancing algorithms used in parallel systems do not take response times into account: tasks (or requests) are simply dispatched to a server, which provides no guarantees about their execution times. When tasks have a maximum acceptable response time (i.e., a deadline), the consequences of adopting traditional load-balancing algorithms can be catastrophic: under heavy load, a large number of tasks miss their deadlines, even short ones. Moreover, the number of longer tasks that complete is very small, close to zero in all cases. In this paper we discuss why the traditional algorithms fail to provide the intended QoS. We then present a new algorithm, "On-demand Restriction for Big Tasks" (ORBITA), which simulations show to be a fair alternative for stressed systems, since tasks of all durations have a chance to complete before their deadlines.
