INFORM: integrated flow orchestration and meta-scheduling for managed grid systems

Workflow applications are now routinely executed in enterprise and scientific grid domains. The core middleware technologies for grids (e.g. meta-schedulers) contain sophisticated resource-matching logic but lack control-flow orchestration capability. Workflow orchestrators, on the other hand, control business logic well but are unaware of the execution requirements of individual tasks. Combining scheduling technology with workflow management is therefore essential in the design of middleware for geographically distributed grids that span organizational domains. Existing efforts, however, rely on ad hoc, non-standard solutions and lack support for efficient data modeling and handling. In this paper, we present INFORM, an end-to-end middleware solution for grid environments that integrates workflow orchestration and meta-scheduling. We address job flow modeling, transparent workflow adaptation, and data handling in job flows by creating assets built on standardized technologies. We present an implementation of INFORM based on a set of industrial products from IBM and validate it using a Montage application. Our results demonstrate that such an integrated workflow management approach can yield significant benefits compared with competing methodologies.
