Design and implementation tradeoffs for wide-area resource discovery

This paper describes the design and implementation of SWORD, a scalable resource discovery service for wide-area distributed systems. In contrast to previous systems, SWORD allows users to describe desired resources as a topology of interconnected groups with required intragroup, intergroup, and per-node characteristics, along with the utility that the application derives from various ranges of values of those characteristics. This design gives users the flexibility to find geographically distributed resources for applications that are sensitive to both node and network characteristics, and allows the system to rank acceptable configurations based on their quality for that application. We explore a variety of architectures to deliver SWORD's functionality in a scalable and highly available manner. A 1000-node ModelNet evaluation using a workload of measurements collected from PlanetLab shows that an architecture based on 4-node server cluster sites at network peering facilities outperforms a decentralized DHT-based resource discovery infrastructure for all but the smallest number of sites. While such a centralized architecture shows significant promise, we find that our decentralized implementation, both in emulation and running continuously on over 200 PlanetLab nodes, performs well while benefiting from the DHT's self-healing properties.
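
The query model described above (per-node requirements, an intragroup constraint, and utility attached to ranges of values) can be illustrated with a minimal sketch. This is not SWORD's query language or its search algorithm: the attribute names, the range tables, the latency bound, and the functions `node_utility` and `best_group` are all hypothetical, and the exhaustive search is only meant to make the ranking idea concrete for a handful of nodes.

```python
from itertools import combinations

# Hypothetical per-node requirement: attribute -> list of (low, high, utility)
# ranges. A value that falls in no range makes the node unacceptable.
NODE_UTILITY = {
    "free_mem_mb": [(256, 512, 0.5), (512, float("inf"), 1.0)],
    "load": [(0.0, 1.0, 1.0), (1.0, 4.0, 0.3)],
}


def node_utility(node, spec=NODE_UTILITY):
    """Sum the utility over all attributes; None if any value is in no range."""
    total = 0.0
    for attr, ranges in spec.items():
        value = node[attr]
        for low, high, utility in ranges:
            if low <= value < high:
                total += utility
                break
        else:
            return None  # no range matched: node fails the requirement
    return total


def best_group(nodes, latency_ms, size, max_latency_ms):
    """Brute-force the highest-utility group that meets the intragroup bound."""
    best, best_score = None, None
    for group in combinations(sorted(nodes), size):
        scores = [node_utility(nodes[name]) for name in group]
        if any(score is None for score in scores):
            continue
        # Intragroup constraint: every pair must be within the latency bound.
        if any(latency_ms[frozenset(pair)] > max_latency_ms
               for pair in combinations(group, 2)):
            continue
        score = sum(scores)
        if best_score is None or score > best_score:
            best, best_score = group, score
    return best, best_score


# Made-up measurements for four nodes, for illustration only.
nodes = {
    "a": {"free_mem_mb": 600, "load": 0.5},
    "b": {"free_mem_mb": 300, "load": 0.2},
    "c": {"free_mem_mb": 1024, "load": 2.0},
    "d": {"free_mem_mb": 100, "load": 0.1},
}
latency_ms = {
    frozenset(("a", "b")): 20, frozenset(("a", "c")): 150,
    frozenset(("b", "c")): 30, frozenset(("a", "d")): 10,
    frozenset(("b", "d")): 10, frozenset(("c", "d")): 10,
}
result = best_group(nodes, latency_ms, size=2, max_latency_ms=50)
```

With this made-up data, node `d` is rejected for too little free memory and the pair `a`, `c` is rejected by the latency bound, so the acceptable groups are ranked by summed utility. Intergroup constraints would add a second check between the members of different groups; they are left out here to keep the sketch short.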
