A Fault Tolerance Scheme for Hierarchical Dynamic Schedulers in Grids

In dynamic grid environments, failures (e.g., link outages and resource failures) are frequent. We present a fault tolerance scheme for the hierarchical dynamic scheduler (HDS) for grid workflow applications. In HDS, all resources are arranged in a hierarchy tree, and each resource acts as a scheduler. The fault tolerance scheme is fully distributed and is responsible for maintaining the hierarchy tree in the presence of failures. The scheme handles root failures as a special case, which prevents the root from becoming a single point of failure. The resources that detect a failure are responsible for taking the appropriate recovery actions. The scheme uses randomization to cope with multiple simultaneous failures. Our simulation results show that the recovery process is fast and that failures have minimal impact on the scheduling process.
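The abstract does not spell out the recovery algorithm, so the following is only an illustrative sketch of how a hierarchy tree might be repaired by the resources that detect a failure. Everything in it is an assumption made for illustration, not the authors' method: the `Node` class, the `repair` function, the rule that orphaned children reattach to their grandparent, and the use of random delays to pick a single new root when the root itself fails.

```python
import random


class Node:
    """A resource in the hierarchy tree (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []
        self.alive = True

    def attach(self, child):
        """Make `child` a direct child of this node."""
        child.parent = self
        self.children.append(child)


def repair(failed, rng):
    """Repair the tree around `failed`.

    Returns the newly elected root if `failed` was the root and had
    live children; returns None otherwise.
    """
    failed.alive = False
    orphans = [c for c in failed.children if c.alive]
    failed.children = []
    grandparent = failed.parent
    failed.parent = None

    if grandparent is not None:
        # Assumed rule: orphans detect the failure and rejoin one level up.
        grandparent.children.remove(failed)
        for orphan in orphans:
            grandparent.attach(orphan)
        return None

    # Root failure (assumed handling): each orphan draws a random delay,
    # and the orphan with the smallest delay becomes the new root.
    if not orphans:
        return None
    delays = {orphan.name: rng.random() for orphan in orphans}
    new_root = min(orphans, key=lambda o: delays[o.name])
    new_root.parent = None
    for orphan in orphans:
        if orphan is not new_root:
            new_root.attach(orphan)
    return new_root
```

In this sketch the random delays stand in for a randomized back-off: when several resources detect the same root failure at once, the draw makes it unlikely that two of them claim the root role at the same time. A real distributed implementation would also need failure detection (for example, heartbeats) and message passing, both of which are omitted here.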
