TASK RESOURCE CONSUMPTION PREDICTION FOR SCIENTIFIC WORKFLOWS

Estimates of task runtime, disk space usage, and memory consumption are commonly used by scheduling and resource provisioning algorithms to support efficient and reliable workflow executions. Such algorithms often assume that accurate estimates are available, but these estimates are difficult to generate in practice. In this work, we first profile five real scientific workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize workflow task requirements based on these profiles. Our method estimates task runtime, disk space, and peak memory consumption based on the size of the tasks’ input data. It looks for correlations between the parameters of a dataset; if no correlation is found, the dataset is divided into smaller subsets using a clustering technique. Task estimates are generated from the ratio of the parameter to the input data size if the two are correlated, or from the probability distribution function of the parameter otherwise. We then propose an online estimation process based on the MAPE-K loop, in which task executions are monitored and estimates are updated as more information becomes available. Experimental results show that our online estimation process yields much more accurate predictions than an offline approach, in which all task requirements are estimated prior to workflow execution.
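
The estimation rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `pearson`, `estimate`, and `OnlineEstimator` are hypothetical, the correlation threshold of 0.8 is an assumed value, and the clustering step is omitted. Where the abstract refers to the probability distribution function of the parameter, the sketch falls back to the mean of the observed values as a simple stand-in. The `observe`/`predict` pair loosely mirrors the monitor-and-update idea of the online process.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def estimate(history, input_size, threshold=0.8):
    """Estimate a task parameter (e.g. runtime) for a given input size.

    history: list of (input_size, observed_value) pairs from past executions.
    threshold: assumed cutoff above which the two are treated as correlated.
    """
    sizes = [s for s, _ in history]
    values = [v for _, v in history]
    if len(history) >= 2 and abs(pearson(sizes, values)) >= threshold:
        # Correlated: scale the new input size by the mean value/size ratio.
        ratio = mean(v / s for s, v in history if s > 0)
        return ratio * input_size
    # Not correlated: stand-in for a distribution-based estimate.
    return mean(values)


class OnlineEstimator:
    """Keeps a history of finished tasks and refines estimates as tasks complete."""

    def __init__(self, threshold=0.8):
        self.history = []
        self.threshold = threshold

    def observe(self, input_size, value):
        # Record the measurement from a completed task execution.
        self.history.append((input_size, value))

    def predict(self, input_size):
        # No knowledge yet: no estimate can be given.
        if not self.history:
            return None
        return estimate(self.history, input_size, self.threshold)
```

With a history in which runtime is exactly twice the input size, `estimate` scales linearly; with a constant runtime, it returns that constant regardless of input size. An offline approach would correspond to calling `estimate` once on profiles gathered before execution, while the online variant calls `observe` after every finished task.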
