Peer-to-Peer Prefix Tree for Large Scale Service Discovery

This thesis studies service discovery on large-scale distributed platforms, where a service is a computing service (software component, scientific computing library, or binary) offered with given characteristics and a performance level that depends on the hardware supporting it. Traditional approaches, designed for small, reliable environments, rely on centralized solutions that do not scale to geographically distributed, unreliable platforms. Our contribution has three main parts. 1) We propose a novel approach called DLPT (Distributed Lexicographic Placement Table), whose design is inspired by peer-to-peer systems. It relies on an indexing system structured as a prefix tree, which supports multi-attribute range queries. 2) We study the mapping of the nodes of this tree onto the heterogeneous processors of the dynamic underlying network, and we propose and adapt load balancing heuristics for this kind of architecture. 3) Because our architecture targets platforms whose processors are unreliable and constantly join and leave the network, it requires fault-tolerance mechanisms. Replication, the usual choice, is costly and cannot handle transient faults. We propose alternative best-effort mechanisms, based on self-stabilization theory, for the construction and maintenance of prefix trees in a peer-to-peer environment. One of these mechanisms is proven to be snap-stabilizing, meaning that the tree is rebuilt in optimal time after an arbitrary number of faults. However, this approach is written in a coarse-grained communication model and places several restrictions on the initial topology, which makes it hard to implement on real platforms. To address these drawbacks, we give another self-stabilizing protocol designed for actual message-passing environments. Finally, we present a software prototype of this architecture and promising first experiments on the Grid'5000 platform.
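To make the indexing idea concrete, the following is a minimal, centralized sketch of a prefix tree that answers lexicographic range queries over service names. It is not the DLPT implementation: the class and method names (`PrefixTree`, `insert`, `range_query`), the one-character-per-edge layout, and the example service keys are assumptions made for illustration, and the sketch covers a single attribute on a single machine rather than a multi-attribute tree distributed over peers.

```python
from typing import Dict, List


class TrieNode:
    """One node of the (uncompressed) prefix tree."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Non-empty only on nodes where a registered service key ends.
        self.providers: List[str] = []


class PrefixTree:
    """Hypothetical single-machine index: service key -> provider names."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, key: str, provider: str) -> None:
        """Register `provider` as offering the service named `key`."""
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.providers.append(provider)

    def lookup(self, key: str) -> List[str]:
        """Exact-match query: providers registered under `key`."""
        node = self.root
        for ch in key:
            if ch not in node.children:
                return []
            node = node.children[ch]
        return list(node.providers)

    def range_query(self, low: str, high: str) -> List[str]:
        """Registered keys k with low <= k <= high, in lexicographic order."""
        found: List[str] = []

        def visit(node: TrieNode, prefix: str) -> None:
            # Every key in this subtree starts with `prefix`, so it is
            # >= prefix: if prefix already exceeds `high`, skip the subtree.
            if prefix > high:
                return
            # If prefix sorts before the matching part of `low`, every key
            # in this subtree sorts before `low`: skip the subtree.
            if prefix < low[:len(prefix)]:
                return
            if node.providers and low <= prefix:
                found.append(prefix)
            for ch in sorted(node.children):
                visit(node.children[ch], prefix + ch)

        visit(self.root, "")
        return found


if __name__ == "__main__":
    index = PrefixTree()
    # Illustrative keys and provider names only.
    for key, host in [("DGEMM", "node-a"), ("DGEMV", "node-b"),
                      ("DTRSM", "node-a"), ("SGEMM", "node-c")]:
        index.insert(key, host)
    print(index.range_query("DGEMM", "DTRSM"))
```

Because keys that share a prefix live in the same subtree, a range query can discard whole subtrees by comparing their prefix with the bounds, which is the property that makes a prefix tree a natural fit for range queries.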
