Shared Content Management in Replicated Web Systems: A Design Framework Using Problem Decomposition, Controlled Simulation, and Feedback Learning

Replication is one of the primary techniques used to improve the quality of distributed content service: it generally reduces user latency and increases a site's availability. However, to our knowledge, there is no systematic framework that combines the structure of both the content and the service components of a Web application to design effective replica hosting architectures. Recent advances in interconnected, multiple content distribution network (CDN) architectures make this problem even more complex. In this study, we develop a systematic framework for designing and evaluating large-scale, component-based replication architectures for Web systems, driven by both the quality and the effectiveness of service provisioning on the service network. To explore the design space, the proposed framework combines problem decomposition, configuration evaluation through controlled system simulations, and a neural-network-based feedback learning mechanism. A case study demonstrates the viability of the framework, which can serve as an effective decision support tool for a system designer to systematically explore design options and select the design configuration that best meets the desired design objectives.
