Dataflow-Based Scheduling for Scientific Workflows in HPC with Storage Constraints

In high-performance computing (HPC), workflow-based workloads are typically data intensive, often serving exploratory analysis of a scientific problem with a large parameter space. Storage constraints are a persistent practical concern for such workloads: problem scales and their required datasets, especially in big data science, keep growing faster than storage capacity. Workflow computation in an HPC environment with finite storage therefore remains a practical topic worth studying. To this end, we propose a novel scheduling framework that enhances the Versioned Name Space and Overwrite-Safe Concurrency scheduling policies, introduced in our earlier work, with the ability to handle deadlock in workflow computations under finite storage. We achieve this by using the workflow's data dependency information to integrate a collection of deadlock resolution algorithms into the workflow scheduler. Extensive simulation-based studies show that the enhanced scheduling policies can resolve the deadlocks that arise when data overflows the available storage. More interestingly, when storage is highly constrained, the enhanced policies achieve a better makespan than applying the deadlock resolution algorithms alone.
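To make the deadlock problem concrete, the following is a minimal sketch of how a workflow can stall under a storage limit. It is a toy model and not the scheduling framework proposed here: the `Job` class, the `run_workflow` function, the file sizes, and the rule that a file is released once its last consumer finishes are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Job:
    """Hypothetical workflow job: reads input files, writes output files."""
    name: str
    inputs: List[str]
    outputs: Dict[str, int]  # file name -> size in storage units


def run_workflow(jobs: List[Job], capacity: int) -> Tuple[List[str], bool]:
    """Run jobs one at a time under a storage limit.

    Returns (names of completed jobs in order, deadlocked flag).
    Assumption: a file is kept until every job that reads it has finished;
    files that nobody reads are final results and stay in storage.
    """
    # Data dependency information: how many jobs still need each file.
    consumers: Dict[str, int] = {}
    for job in jobs:
        for f in job.inputs:
            consumers[f] = consumers.get(f, 0) + 1

    stored: Dict[str, int] = {}  # files currently occupying storage
    pending = list(jobs)
    order: List[str] = []

    while pending:
        used = sum(stored.values())
        # A job can start only if its inputs exist and its outputs fit.
        runnable = [
            j for j in pending
            if all(f in stored for f in j.inputs)
            and used + sum(j.outputs.values()) <= capacity
        ]
        if not runnable:
            # Jobs remain, but none can make progress: the workflow is stuck.
            return order, True
        job = runnable[0]
        pending.remove(job)
        order.append(job.name)
        stored.update(job.outputs)
        # Release inputs that no remaining job depends on.
        for f in job.inputs:
            consumers[f] -= 1
            if consumers[f] == 0:
                del stored[f]
    return order, False


# Toy workflow: A produces a; B reads a and produces b; C reads a and b.
WORKFLOW = [
    Job("A", [], {"a": 4}),
    Job("B", ["a"], {"b": 4}),
    Job("C", ["a", "b"], {"c": 2}),
]

if __name__ == "__main__":
    print(run_workflow(WORKFLOW, capacity=8))
    print(run_workflow(WORKFLOW, capacity=10))
```

In this sketch, a capacity of 8 lets A and B finish but leaves no room for the output of C, while C is the only job that could free `a` and `b`, so the run stops with the deadlocked flag set. With a capacity of 10, all three jobs complete. A deadlock resolution algorithm would intervene at the point where the sketch simply gives up.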
