UsenetDHT: A Low-Overhead Design for Usenet

Usenet is a popular distributed messaging and file sharing service: servers in Usenet flood articles over an overlay network to fully replicate articles across all servers. However, replicating Usenet's full content requires that each server pay the cost of receiving (and storing) over 1 Tbyte/day. This paper presents the design and implementation of UsenetDHT, a Usenet system that allows a set of cooperating sites to keep a shared, distributed copy of Usenet articles. UsenetDHT consists of client-facing Usenet NNTP front-ends and a distributed hash table (DHT) that provides shared storage of articles across the wide area. This design allows participating sites to partition the storage burden, rather than replicating all Usenet articles at all sites. UsenetDHT requires a DHT that maintains durability despite transient and permanent failures and that provides high storage performance. These goals can be difficult to achieve simultaneously: even in the absence of failures, verifying adequate replication levels for large numbers of objects can be resource-intensive and can interfere with normal operations. This paper introduces Passing Tone, a new replica maintenance algorithm for DHash [7] that minimizes the impact of monitoring replication levels on memory and disk resources by operating with only pairwise communication. Passing Tone's implementation achieves high performance by using data structures that avoid disk accesses and enable batch operations. Microbenchmarks over a local gigabit network demonstrate that total system throughput scales linearly as servers are added, providing 5.7 Mbyte/s of write bandwidth and 7 Mbyte/s of read bandwidth per server. UsenetDHT is currently deployed on a 12-server network at 7 sites running Passing Tone over the wide area; this network supports our research laboratory's live 2.5 Mbyte/s Usenet feed and 30.6 Mbyte/s of synthetic read traffic. These results suggest that a DHT-based design may be a viable way to redesign Usenet and globally reduce costs.
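The split between NNTP front-ends and shared DHT storage can be illustrated with a minimal Python sketch. It is not the UsenetDHT or DHash implementation: the names `ToyDHT` and `FrontEnd`, the use of SHA-1 content hashes as keys, the placement of replicas on successive servers of a hash ring, the replication factor of 2, and the shared in-memory Message-ID index are all assumptions made for illustration. The sketch only shows the storage-partitioning idea, in which each article is held by a few servers rather than by every site, and it does no replica maintenance of the kind Passing Tone provides.

```python
import hashlib
from bisect import bisect_right


def _hash_to_int(data: bytes) -> int:
    """Map bytes to a point on the identifier ring (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


class ToyDHT:
    """In-memory stand-in for a DHT: a hash ring with a fixed replica count."""

    def __init__(self, server_names, replicas=2):
        if replicas > len(server_names):
            raise ValueError("need at least as many servers as replicas")
        self.replicas = replicas
        # Each server sits at the hash of its (hypothetical) name.
        self.ring = sorted((_hash_to_int(n.encode()), n) for n in server_names)
        self.ring_ids = [node_id for node_id, _ in self.ring]
        self.stores = {n: {} for n in server_names}

    def _responsible(self, key_hex: str):
        """Return the `replicas` servers that follow the key on the ring."""
        start = bisect_right(self.ring_ids, int(key_hex, 16))
        size = len(self.ring)
        return [self.ring[(start + i) % size][1] for i in range(self.replicas)]

    def put(self, value: bytes) -> str:
        """Store a value under its content hash and return that key."""
        key = hashlib.sha1(value).hexdigest()
        for name in self._responsible(key):
            self.stores[name][key] = value
        return key

    def get(self, key: str) -> bytes:
        """Fetch a value from any server responsible for the key."""
        for name in self._responsible(key):
            if key in self.stores[name]:
                return self.stores[name][key]
        raise KeyError(key)


class FrontEnd:
    """Hypothetical front-end: keeps only a Message-ID -> key index locally."""

    def __init__(self, dht: ToyDHT, index: dict):
        self.dht = dht
        self.index = index  # shared dict stands in for exchanged metadata

    def post(self, message_id: str, article: bytes) -> None:
        self.index[message_id] = self.dht.put(article)

    def read(self, message_id: str) -> bytes:
        return self.dht.get(self.index[message_id])


# Two front-ends at different sites share one DHT and one metadata index.
dht = ToyDHT(["site-a", "site-b", "site-c", "site-d"], replicas=2)
shared_index = {}
poster, reader = FrontEnd(dht, shared_index), FrontEnd(dht, shared_index)
poster.post("<1@example.invalid>", b"Subject: hello\r\n\r\nfirst article")
assert reader.read("<1@example.invalid>").endswith(b"first article")
```

With four servers and two replicas, each article occupies two stores instead of four, so the total storage cost grows with the replication factor rather than with the number of participating sites.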

[1] Robert Morris, et al. Chord: A scalable peer-to-peer lookup service for internet applications, 2001, SIGCOMM.

[2] G. Cox, et al. [title illegible in source], 2006.

[3] Josh Cates, et al. Robust and efficient data management for a distributed hash table, 2003.

[4] Werner Vogels, et al. Dynamo: Amazon's highly available key-value store, 2007, SOSP.

[5] David R. Karger, et al. Wide-area cooperative storage with CFS, 2001, SOSP.

[6] David R. Karger, et al. Diminished Chord: A Protocol for Heterogeneous Subgroup Formation in Peer-to-Peer Networks, 2004, IPTPS.

[7] Peter Druschel, et al. Exploiting network proximity in peer-to-peer overlay networks, 2002.

[8] Yaron Minsky, et al. Set reconciliation with nearly optimal communication complexity, 2003, IEEE Trans. Inf. Theory.

[9] KyoungSoo Park, et al. CoMon: a mostly-scalable monitoring system for PlanetLab, 2006, OPSR.

[10] Stan Barber, et al. Common NNTP Extensions, 2000, RFC.

[11] Robert Tappan Morris, et al. Designing a DHT for Low Latency and High Throughput, 2004, NSDI.

[12] Antony I. T. Rowstron, et al. Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility, 2001, SOSP.

[13] Emin Gün Sirer, et al. Beehive: O(1) Lookup Performance for Power-Law Query Distributions in Peer-to-Peer Overlays, 2004, NSDI.

[14] Larry L. Peterson, et al. Experiences building PlanetLab, 2006, OSDI.

[15] Kirk L. Johnson, et al. Overcast: reliable multicasting with an overlay network, 2000, OSDI.

[16] Karl L. Swartz, et al. Forecasting Disk Resource Requirements for a Usenet Server, 1993, LISA.

[17] Brighten Godfrey, et al. OpenDHT: a public DHT service and its uses, 2005, SIGCOMM.

[18] James Robertson, et al. UsenetDHT: A Low Overhead Usenet Server, 2004, IPTPS.

[19] Andreas Haeberlen, et al. Glacier: highly durable, decentralized storage despite massive correlated failures, 2005, NSDI.

[20] Andreas Haeberlen, et al. Efficient Replica Maintenance for Distributed Storage Systems, 2006, NSDI.

[21] N. S. Barnett, et al. Private communication, 1969.

[22] David Mazières, et al. Democratizing Content Publication with Coral, 2004, NSDI.

[23] David Mazières, et al. A Toolkit for User-Level File Systems, 2001, USENIX Annual Technical Conference, General Track.

[24] Manfred Hauswirth, et al. NewsCache - A High-Performance Cache Implementation for Usenet News, 1999, USENIX Annual Technical Conference, General Track.

[25] David R. Karger, et al. Chord: a scalable peer-to-peer lookup protocol for internet applications, 2003, TNET.

[26] Brighten Godfrey, et al. Heterogeneity and load balance in distributed hash tables, 2005, Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies.

[27] John Kubiatowicz, et al. OpenDHT: a public DHT service, 2005.

[28] Venugopalan Ramasubramanian, et al. Optimal Resource Utilization in Content Distribution Networks, 2005.

[29] Carsten Bormann, et al. The Newscaster Experiment: Distributing Usenet News via Many-to-More Multicast, 1999.

[30] Brian Kantor, et al. Network News Transfer Protocol, 1986, RFC.

[31] David R. Karger, et al. Koorde: A Simple Degree-Optimal Distributed Hash Table, 2003, IPTPS.