A framework for dynamically generating predictive models of workflow execution

Accurately predicting the performance of software components executing within a Cloud environment is an area of intense interest to many researchers. An accurate prediction of the time taken for a piece of code to execute would benefit both planning and cost optimisation. To that end, this paper proposes a performance data capture and modelling architecture that generates models of code execution time and updates them dynamically as additional performance data is collected. To demonstrate the utility of this approach, the workflow engine within the e-Science Central Cloud platform has been instrumented to capture execution data, with a view to generating predictive models of workflow performance. Models have been generated for both simple and more complex workflow components operating on local hardware and within a virtualised Cloud environment. The results demonstrate that accurate performance predictions can be generated, subject to a number of caveats.
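The idea of a model that is updated as performance data arrives can be illustrated with a minimal sketch. It is not the architecture described in the paper: the class name `OnlineExecutionTimeModel`, the choice of a single input feature (input size), and the use of a simple linear fit are all assumptions made for illustration, and the sample observations are invented.

```python
from dataclasses import dataclass


@dataclass
class OnlineExecutionTimeModel:
    """Hypothetical running linear fit: seconds ~ intercept + slope * input_size.

    Keeps only running means and co-moments, so each new observation
    updates the model in constant time without storing past data.
    """
    n: int = 0
    mean_x: float = 0.0   # running mean of input sizes
    mean_y: float = 0.0   # running mean of execution times
    sxx: float = 0.0      # running sum of squared deviations of x
    sxy: float = 0.0      # running sum of co-deviations of x and y

    def update(self, input_size: float, seconds: float) -> None:
        """Fold one captured (input size, execution time) pair into the model."""
        self.n += 1
        dx = input_size - self.mean_x            # deviation from the old mean
        self.mean_x += dx / self.n
        self.mean_y += (seconds - self.mean_y) / self.n
        self.sxx += dx * (input_size - self.mean_x)
        self.sxy += dx * (seconds - self.mean_y)

    def predict(self, input_size: float) -> float:
        """Predict execution time; falls back to the mean if x has no spread."""
        slope = self.sxy / self.sxx if self.sxx > 0 else 0.0
        intercept = self.mean_y - slope * self.mean_x
        return intercept + slope * input_size


# Illustrative (made-up) observations: input size and measured seconds.
model = OnlineExecutionTimeModel()
for size, secs in [(10, 7.0), (20, 12.0), (30, 17.0)]:
    model.update(size, secs)

print(model.predict(40))
```

A running fit of this kind keeps the cost of each update constant, which suits a setting where execution data is captured continuously. A real system would likely need richer features and a way to express the caveats the abstract mentions.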
