Improving Availability and Performance with Application-Specific Data Replication

The emerging edge services architecture promises to improve the availability and performance of Web services by replicating servers at geographically distributed sites. A key challenge in such systems is data replication and consistency: edge server code must be able to manipulate shared data without suffering the availability and performance penalties incurred by accessing a traditional centralized database. This work explores using a distributed object architecture to build an edge service data replication system for an e-commerce application, the TPC-W benchmark, which simulates an online bookstore. We take advantage of application-specific semantics to design distributed objects that each manage a specific subset of shared information using simple and effective consistency models. Our experimental results show that by slightly relaxing consistency within individual distributed objects, our application realizes both high availability and excellent performance. For example, in one experiment, our object-based edge server system provides five times better response time than a traditional centralized cluster architecture and a factor of nine improvement over an edge service system that distributes code but retains a centralized database.
