Modeling and Verifying Non-DAG Workflows for Computational Grids

The advent of grid technology has provided a promising way to use distributed resources for complex scientific workflow applications. Grid workflow management systems (GWMS) enable even a non-expert grid user to compose workflows, and they coordinate execution of the complete workflow, thereby masking grid-specific details from the user. We have designed a non-DAG workflow specification model for workflow composition. Our model allows a user to compose a workflow as a general directed graph, so that sequence, parallel, choice and iteration patterns can all be modeled. We also provide structural verification of workflows using Petri net based analysis techniques. A workflow is said to be structurally correct if it is free of errors such as deadlock and lack of synchronization. A workflow found correct by structural verification can be executed correctly for all possible workflow instances. We have incorporated our workflow model into a previously existing workflow management system for Sun Grid Engine [1]. Experiments show that the model allows composition of a wider range of workflow applications. In addition, when a structural error is detected, the verification procedure provides the user with an appropriate error message indicating the nature and cause of the error. We have composed and executed the workflow of EMAN [2], a real-world bioinformatics application. The experimental results show that the inherently parallel tasks in the EMAN workflow make efficient use of the compute power of the grid.
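The kind of structural check described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the workflow has already been translated into a Petri net in which each transition maps a list of input places to a list of output places, with a single start place `i` and a single end place `o`. The function name `verify`, the place and transition names, and the rule that lack of synchronization appears as a place holding more than one token are all assumptions made for this example. The sketch explores the reachable markings and reports a deadlock when no transition is enabled and the marking is not the expected final one.

```python
from collections import Counter, deque

def freeze(marking):
    # Canonical, hashable form of a marking (empty places are dropped).
    return tuple(sorted((p, n) for p, n in marking.items() if n > 0))

def verify(transitions, start="i", end="o"):
    """Explore reachable markings of a workflow net (hypothetical encoding).

    transitions: dict mapping a transition name to (input_places, output_places).
    Returns a sorted list of the structural error kinds that were found.
    """
    errors = set()
    initial = Counter({start: 1})
    seen = {freeze(initial)}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        # Assumption: more than one token on a place means two branches
        # merged without being synchronized.
        if any(n > 1 for n in marking.values()):
            errors.add("lack of synchronization")
            continue  # unsafe markings are reported, not expanded
        enabled = [t for t, (ins, _) in transitions.items()
                   if all(marking[p] >= 1 for p in ins)]
        # A stuck marking other than "one token on the end place" is a deadlock.
        if not enabled and freeze(marking) != ((end, 1),):
            errors.add("deadlock")
        for t in enabled:
            ins, outs = transitions[t]
            nxt = marking.copy()
            nxt.subtract(ins)   # consume one token per input place
            nxt.update(outs)    # produce one token per output place
            key = freeze(nxt)
            if key not in seen:
                seen.add(key)
                queue.append(nxt)
    return sorted(errors)

# Choice (XOR) split followed by a parallel (AND) join: only one branch runs.
xor_then_and = {
    "choose_a": (["i"], ["a"]), "choose_b": (["i"], ["b"]),
    "task_a": (["a"], ["c"]), "task_b": (["b"], ["d"]),
    "join": (["c", "d"], ["o"]),
}
# Parallel (AND) split followed by a choice (XOR) merge: both branches arrive.
and_then_xor = {
    "split": (["i"], ["a", "b"]),
    "task_a": (["a"], ["c"]), "task_b": (["b"], ["d"]),
    "merge_a": (["c"], ["o"]), "merge_b": (["d"], ["o"]),
}
# Iteration: a cycle, so the graph is not a DAG, yet it is structurally sound.
loop = {
    "enter": (["i"], ["p"]), "work": (["p"], ["q"]),
    "repeat": (["q"], ["p"]), "leave": (["q"], ["o"]),
}

print(verify(xor_then_and))
print(verify(and_then_xor))
print(verify(loop))
```

Because markings with more than one token on a place are reported and not expanded further, the search only ever expands a finite set of markings and terminates even for workflows that contain iteration.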