Sector: A high performance wide area community data storage and sharing system

We introduce Sector, a system that provides data storage and sharing services over high-performance wide area networks. The rapidly increasing size of scientific datasets has created a critical need for such systems, and the rapid growth of optical networks is making them practical to build. Sector is designed to serve a community whose users can upload and store large datasets, easily share their data with others, and perform distributed processing over that data through a very simple API. Sector uses a peer-to-peer routing mechanism to organize the participating nodes. Sector servers provide only basic functions for storing, locating, and accessing data. Users can access data through a Sector client or write distributed applications with the Sector client API. We have successfully used Sector to store and distribute various products from the Sloan Digital Sky Survey (SDSS) to astronomers around the world.
