Handling Recoverable Temporal Violations in Scientific Workflow Systems: A Workflow Rescheduling Based Strategy

Due to the complex nature of scientific workflow systems, violations of temporal QoS constraints occur frequently and may severely reduce the usefulness of execution results. Therefore, to deliver satisfactory QoS, temporal violations need to be recovered effectively. However, two fundamental issues have not yet been well addressed: how to define recoverable temporal violations, and how to design corresponding exception handling strategies. In this paper, we first propose a probability based temporal consistency model to define the temporal violations that are statistically recoverable by light-weight exception handling strategies. We then propose a novel Ant Colony Optimisation based two-stage workflow local rescheduling strategy (ACOWR) to handle detected recoverable temporal violations in an automatic and cost-effective fashion. Simulation experiments conducted in our scientific workflow system demonstrate that the strategy is effective in reducing both local and global temporal violation rates.