Self-managing federated services

We consider the problem of deploying and managing federated services that run on federated systems spanning multiple collaborating organizations. In particular, we present a peer-to-peer framework for building self-managing services that automatically adjust the number of service components and their placement in response to changes in the system or in client load. Our framework is completely decentralized and depends only on a modest amount of loosely synchronized global state. More specifically, the framework comprises a set of per-node monitoring agents and per-service-component management agents that periodically exchange information about the state of the system and of the service using a gossiping protocol. Each management agent periodically searches for configurations that are better than the current one according to an application model and explicit performance and availability targets. On finding a better configuration, an agent enacts it after a random delay to avoid possible collisions. We evaluate our framework by studying a prototype UDDI service. We show that, although the agents act autonomously, the service rapidly reaches a stable and appropriate configuration in response to system dynamics.
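The search-then-delay step of a management agent can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the framework's actual design: the application model is reduced to "enough replicas and enough total capacity", the search is an exhaustive scan over small node sets, and the names `meets_targets`, `rank`, `find_better`, and `enact_delay` are hypothetical.

```python
import random
from itertools import combinations

def meets_targets(placement, capacity, load, min_replicas):
    """Hypothetical application model: a placement is acceptable when it has
    enough replicas (availability) and enough total capacity (performance)."""
    total = sum(capacity[node] for node in placement)
    return len(placement) >= min_replicas and total >= load

def rank(placement, capacity, load, min_replicas):
    """Lower is better: satisfy the targets first, then use fewer components."""
    ok = meets_targets(placement, capacity, load, min_replicas)
    return (0 if ok else 1, len(placement))

def find_better(current, capacity, load, min_replicas):
    """Return a placement that ranks strictly better than `current`,
    or `current` itself when no such placement exists."""
    nodes = sorted(capacity)
    candidates = [
        frozenset(combo)
        for size in range(1, len(nodes) + 1)
        for combo in combinations(nodes, size)
    ]
    # Sorting by node names as a tie-breaker keeps the choice deterministic.
    best = min(
        candidates,
        key=lambda p: (rank(p, capacity, load, min_replicas), tuple(sorted(p))),
    )
    if rank(best, capacity, load, min_replicas) < rank(
        current, capacity, load, min_replicas
    ):
        return best
    return current

def enact_delay(rng, max_delay):
    """Random wait before enacting, so that agents which found a better
    configuration at the same time are unlikely to act simultaneously."""
    return rng.uniform(0.0, max_delay)

# Assumed view of the system, as an agent might learn it through gossip.
capacity = {"a": 50, "b": 80, "c": 30}
proposal = find_better(frozenset({"a"}), capacity, load=100, min_replicas=2)
delay = enact_delay(random.Random(0), max_delay=5.0)
```

In this sketch an agent whose current placement already ranks as well as any alternative gets its own placement back, so it makes no change. That is one simple way to obtain the stability the abstract describes, though the real framework may reach it differently.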
