Neptune: Scalable Replication Management and Programming Support for Cluster-based Network Services

Previous research has addressed the scalability and availability issues associated with the construction of cluster-based network services. This paper studies the clustering of replicated services when the persistent service data is frequently updated. To this end, we propose Neptune, an infrastructural middleware that provides a flexible interface to aggregate and replicate existing service modules. Neptune accommodates a variety of underlying storage mechanisms and maintains a dynamic, location-transparent service mapping to isolate faulty modules and enforce replica consistency. Furthermore, it allows efficient use of a multi-level replica consistency model with staleness control at its highest level. This paper describes Neptune's overall architecture, data replication support, and the results of our performance evaluation.
