Load balancing in distributed Web server systems with partial document replication

How the documents of a Web site are replicated and where they are placed among the server nodes have an important bearing on load balance in a geographically distributed Web server (DWS) system. The traffic generated by moving documents at runtime can also affect the performance of the DWS system. In this paper, we prove that minimizing such traffic is NP-hard. We propose a new document distribution scheme that periodically performs partial replication of a site's documents at selected server locations to maintain load balance. The scheme uses several approximation algorithms to minimize the traffic generated. Simulation results show that it achieves better load balancing than a dynamic scheme, while the internal traffic it causes has a negligible effect on the system's performance.
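To make the trade-off between load balance and document-movement traffic concrete, here is a minimal sketch of one periodic placement round. It is not the paper's algorithm: the greedy heaviest-first heuristic, the `Doc` fields, the `slack` parameter, and the rule that a document gets one replica per "ideal server share" of load are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    name: str
    size: int     # bytes; copying the document to a new server costs this much traffic
    load: float   # expected request load over the next period (hypothetical unit)


def place(docs, n_servers, current, slack=0.1):
    """One placement round (illustrative greedy heuristic, not the paper's method).

    current maps a document name to the set of servers that already hold it.
    Returns (placement, per-server loads, bytes of movement traffic).
    """
    total = sum(d.load for d in docs)
    ideal = total / n_servers if total > 0 else 1.0
    loads = [0.0] * n_servers
    placement = {}
    traffic = 0
    # Heaviest documents first, as in classic greedy load-balancing heuristics.
    for d in sorted(docs, key=lambda doc: doc.load, reverse=True):
        holders = current.get(d.name, set())
        # Partial replication: only documents hotter than one server's
        # ideal share receive more than one replica.
        k = min(n_servers, max(1, math.ceil(d.load / ideal)))

        def cost(s):
            # Servers already holding the document get a small bonus,
            # which trades a little imbalance for less movement traffic.
            return loads[s] - (slack * ideal if s in holders else 0.0)

        chosen = sorted(range(n_servers), key=cost)[:k]
        for s in chosen:
            loads[s] += d.load / k
            if s not in holders:
                traffic += d.size
        placement[d.name] = set(chosen)
    return placement, loads, traffic


if __name__ == "__main__":
    docs = [Doc("a", 100, 8.0), Doc("b", 50, 2.0), Doc("c", 50, 2.0), Doc("d", 10, 2.0)]
    first, loads, traffic = place(docs, 2, current={})
    print(first, loads, traffic)
    # Re-running with the previous placement as the starting point
    # shows how the slack bonus suppresses unnecessary moves.
    print(place(docs, 2, current=first))
```

Raising `slack` in this sketch keeps more documents where they are at the cost of a less even load; setting it to zero ignores existing placement entirely.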
