Paper information - Proactive Replication for Data Durability

Proactive Replication for Data Durability

Many wide-area storage systems replicate data for durability. A common way of maintaining the replicas is to detect node failures and respond by creating additional copies of objects that were stored on failed nodes and hence suffered a loss of redundancy. Reactive techniques can minimize total bytes sent since they only create replicas as needed; however, they can create spikes in network use after a failure. These spikes may overwhelm application traffic and can make it difficult to provision bandwidth. This paper explores a proactive approach that creates additional copies not in response to failures, but periodically at a fixed low rate. We introduce Tempo, a distributed hash table that allows each user to specify a maximum maintenance bandwidth and uses it to perform proactive replication. Results from a simulation study suggest that Tempo can deliver high durability despite only using several kilobytes per second of bandwidth, comparable to state-of-the-art reactive systems.
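The fixed-rate idea in the abstract can be illustrated with a toy model. The sketch below is not Tempo's algorithm, which the abstract does not specify; it only shows, under simplifying assumptions, how a per-user bandwidth cap can drive replica creation independently of failures. The class name `ProactiveReplicator`, its fields, the equal object sizes, and the "fewest replicas first" policy are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProactiveReplicator:
    """Toy model: spend a fixed byte budget per second on extra replicas.

    Hypothetical illustration only; not the design described in the paper.
    """
    rate_bytes_per_sec: int   # user-specified maintenance cap (assumed name)
    object_size: int          # assumed positive and identical for all objects
    replicas: dict = field(default_factory=dict)  # object id -> replica count
    credit: int = 0           # unspent bytes carried over between ticks
    bytes_sent: int = 0       # total maintenance traffic so far

    def tick(self, seconds: int = 1) -> list:
        """Advance time and return the ids of objects that gained a replica."""
        self.credit += self.rate_bytes_per_sec * seconds
        copied = []
        while self.credit >= self.object_size:
            # An object with zero replicas is lost and cannot be copied.
            live = [k for k, n in self.replicas.items() if n > 0]
            if not live:
                break
            # Assumed policy: help the least-replicated object first.
            target = min(live, key=lambda k: (self.replicas[k], k))
            self.replicas[target] += 1
            self.credit -= self.object_size
            self.bytes_sent += self.object_size
            copied.append(target)
        return copied

    def fail_node(self, lost: list) -> None:
        """Remove one replica of each listed object.

        No repair traffic is triggered here; later ticks make up the loss
        at the same fixed rate, so there is no post-failure spike.
        """
        for obj in lost:
            self.replicas[obj] = max(0, self.replicas[obj] - 1)
```

In this model a failure changes which object is copied next but never how many bytes are sent per second, which is the contrast with reactive repair that the abstract draws.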
