Omnivore: Integration of Grid Meta-Scheduling and Peer-to-Peer Technologies

Dedicated servers remain a common constituent of Grid job scheduling architectures, forcing site administrators to compromise between administrative expense and system reliability. Besides requiring administrative attention, dedicated servers create single points of failure and must not be subject to network churn. This paper presents the design and implementation of Omnivore, a fully decentralized job scheduling system built on a peer-to-peer meta-scheduler. Omnivore copes with both node failures and network churn, eliminating the need for central administration and continuous resource availability. It is integrated into the Grid landscape (in particular, the Globus Toolkit 4) by means of the GridWay meta-scheduler to provide scalable distributed scheduling, replicated storage, and system monitoring capabilities. Results from an experimental evaluation of our implementation show that Omnivore is both scalable and resilient in the presence of node failures and network churn.
