Predicting Workflow Task Execution Time in the Cloud Using a Two-Stage Machine Learning Approach

Many techniques, such as scheduling and resource provisioning, rely on performance predictions of workflow tasks for varying input data. However, such estimates are difficult to generate in the cloud. This paper introduces a novel two-stage machine learning approach for predicting workflow task execution times for varying input data in the cloud. To achieve high-accuracy predictions, our approach relies on parameters reflecting runtime information and on two stages of prediction. Empirical results for four real-world workflow applications and several commercial cloud providers demonstrate that our approach outperforms existing prediction methods. In our experiments, our approach achieves best-case and worst-case estimation errors of 1.6 and 12.2 percent, respectively, while existing methods produced errors beyond 20 percent (in some cases even over 50 percent) for more than 75 percent of the evaluated workflow tasks. In addition, we show that the models produced by our approach for a specific cloud can be ported to new clouds with low effort and low error, requiring only a small number of executions on the new cloud.
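To make the two-stage idea concrete, the following minimal sketch shows one plausible reading of it: a first model estimates a runtime parameter from the input data size, and a second model estimates the execution time from that estimated parameter. This is an illustration under stated assumptions, not the authors' implementation. The choice of simple linear regression, the runtime parameter (peak memory), all function and variable names, and the training numbers are hypothetical and were invented for the example.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def predict(model, x):
    """Evaluate a fitted line (a, b) at x."""
    a, b = model
    return a * x + b

# Hypothetical history of past task executions (synthetic values).
input_mb = [10, 20, 30, 40]             # input data size, known before the run
peak_mem_mb = [120, 220, 320, 420]      # runtime parameter, observed during the run
exec_time_s = [13.0, 23.0, 33.0, 43.0]  # measured execution time

# Stage 1: learn to estimate the runtime parameter from the input data.
stage1 = fit_line(input_mb, peak_mem_mb)
# Stage 2: learn to estimate the execution time from the runtime parameter.
stage2 = fit_line(peak_mem_mb, exec_time_s)

def predict_time(new_input_mb):
    """Chain both stages: input size -> runtime parameter -> execution time."""
    estimated_mem = predict(stage1, new_input_mb)
    return predict(stage2, estimated_mem)

def relative_error_pct(predicted, actual):
    """Absolute relative estimation error, in percent."""
    return abs(predicted - actual) / actual * 100.0

if __name__ == "__main__":
    estimate = predict_time(25)
    # 28.5 s is a made-up "measured" time used only to show the error metric.
    print(estimate, relative_error_pct(estimate, 28.5))
```

The point of the chaining is that runtime parameters are unknown before a task runs, so the first stage supplies estimates of them that the second stage can use. In practice any regression method could replace the linear fits used here.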
