Modeling and predicting execution time of scientific workflows in the Grid using radial basis function neural network

With the maturing of electronic science (e-science), scientific applications are growing more complex, being composed of sets of coordinating tasks with complex dependencies among them, referred to as workflows. For optimized execution of workflows in the Grid, high-level middleware services (such as the task scheduler, resource broker, and performance steering service) need advance estimates of workflow execution times. However, modeling and predicting workflow execution time in the Grid is complex because a workflow comprises several tasks, these tasks execute in a distributed fashion on multiple heterogeneous Grid sites, and the shared Grid resources behave dynamically. In this paper, we describe a novel method based on a radial basis function neural network to model and predict workflow execution time in the Grid. We model workflow execution time in terms of attributes describing workflow structure and execution runtime information. To further refine our models, we employ principal component analysis to eliminate attributes of lesser importance. We recommend a set of only 14 attributes (out of 21 in total) to effectively model workflow execution time. Our reduced set of attributes improves the prediction accuracy by 16%. Results of our prediction experiments for three real-world scientific workflows show that our predictions are more accurate than those of the two best methods from related work.
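The two ingredients named in the abstract, attribute reduction through principal component analysis and regression with a radial basis function network, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how attributes are scored, how centers are chosen, or how widths are set, so the following are assumptions made only for illustration: attributes are ranked by their variance-weighted loadings on the leading components, centers are a random subset of the training points, a single Gaussian width is set to the mean distance between centers, and output weights are fitted by linear least squares. The names `rank_attributes_by_pca` and `RBFNetwork` are hypothetical.

```python
import numpy as np


def rank_attributes_by_pca(X, var_kept=0.95):
    """Return attribute indices ordered from most to least important.

    Assumed scoring rule: sum of absolute loadings on the leading
    principal components, weighted by each component's explained variance.
    """
    # Standardize so attributes on different scales are comparable.
    Xc = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Keep just enough components to reach the requested variance share.
    n_comp = int(np.searchsorted(np.cumsum(explained), var_kept) + 1)
    n_comp = min(n_comp, len(s))
    scores = (np.abs(vt[:n_comp]) * explained[:n_comp, None]).sum(axis=0)
    return np.argsort(scores)[::-1]


class RBFNetwork:
    """Gaussian RBF hidden layer followed by a linear output with bias."""

    def __init__(self, n_centers=10, seed=0):
        self.n_centers = n_centers
        self.seed = seed

    def _design(self, X):
        # Squared distances from every sample to every center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        phi = np.exp(-d2 / (2.0 * self.width**2))
        # Append a constant column for the bias term.
        return np.hstack([phi, np.ones((X.shape[0], 1))])

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        k = min(self.n_centers, len(X))
        # Assumption: centers are a random subset of the training points.
        self.centers = X[rng.choice(len(X), size=k, replace=False)]
        d = np.sqrt(((self.centers[:, None] - self.centers[None]) ** 2).sum(axis=2))
        # Assumption: one shared width, the mean distance between centers.
        self.width = d[d > 0].mean() if np.any(d > 0) else 1.0
        self.weights, *_ = np.linalg.lstsq(self._design(X), y, rcond=None)
        return self

    def predict(self, X):
        return self._design(X) @ self.weights
```

With a feature matrix `X` of 21 attributes per workflow run and a vector `y` of measured execution times, one would take `keep = rank_attributes_by_pca(X)[:14]` and train with `RBFNetwork().fit(X[:, keep], y)`. The cut-off of 14 comes from the abstract; which attributes survive depends on the data and on the scoring rule assumed here.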
