An event service to support Grid computational environments

We believe it is interesting to study the system and software architecture of environments that integrate the evolving ideas of computational Grids, distributed objects, Web services, peer‐to‐peer (P2P) networks and message‐oriented middleware. Such P2P Grids should seamlessly link users to one another and to resources, which are themselves linked to each other. We can abstract such environments as a distributed system of ‘clients’, each of which is a ‘user’, a ‘resource’ or a proxy for one. These clients must be linked together in a flexible, fault‐tolerant, efficient, high‐performance fashion. In this paper, we study the messaging or event system, termed the Grid Event Service (GES), that is appropriate for linking the clients (both users and resources) together. For our purposes (registering, transporting and discovering information), events are just messages, typically with time stamps. GES must scale over a wide variety of devices, from handheld computers at one extreme to high‐performance computers and sensors at the other. We have analyzed the requirements of several Grid services that could be built with this model, including computing and education, and have incorporated the constraints of collaboration with a shared event model. We suggest that generalizing the well‐known publish–subscribe model is an attractive approach, and here we study some of the issues to be addressed if this model is used in GES. Copyright © 2002 John Wiley & Sons, Ltd.
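As a rough illustration of the ideas in the abstract (events as time-stamped messages, and publish–subscribe generalized beyond plain topic matching), the following minimal Python sketch may help. It is not the GES design: the names `Event`, `Broker`, `subscribe` and `publish`, the in-process delivery, and the use of a per-subscription predicate as the "generalization" are all assumptions made for illustration only.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """An event is just a message with a time stamp (hypothetical layout)."""
    topic: str
    payload: Dict[str, Any]
    timestamp: float = field(default_factory=time.time)


class Broker:
    """Hypothetical in-process broker linking publishing and subscribing clients."""

    def __init__(self) -> None:
        # Each subscription is (topic, predicate, callback).
        self._subs: List[
            Tuple[str, Callable[[Event], bool], Callable[[Event], None]]
        ] = []

    def subscribe(
        self,
        topic: str,
        callback: Callable[[Event], None],
        predicate: Callable[[Event], bool] = lambda event: True,
    ) -> None:
        # The optional predicate filters on event content, one possible
        # (assumed) way to generalize topic-only publish-subscribe.
        self._subs.append((topic, predicate, callback))

    def publish(self, event: Event) -> int:
        """Deliver the event to every matching subscriber; return the count."""
        delivered = 0
        for topic, predicate, callback in self._subs:
            if topic == event.topic and predicate(event):
                callback(event)
                delivered += 1
        return delivered


if __name__ == "__main__":
    broker = Broker()
    inbox: List[Event] = []
    # A 'user' client interested only in hot readings from a 'resource' client.
    broker.subscribe(
        "sensor/temperature",
        inbox.append,
        predicate=lambda event: event.payload["celsius"] > 30,
    )
    broker.publish(Event("sensor/temperature", {"celsius": 35}))
    broker.publish(Event("sensor/temperature", {"celsius": 20}))
    print(len(inbox), "event(s) delivered")
```

A real event service of the kind the abstract describes would also need distributed routing, fault tolerance and support for very different device classes, none of which this single-process sketch attempts.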
