On the Use of Machine Learning to Predict the Time and Resources Consumed by Applications

Most data centers, clouds and grids consist of multiple generations of computing systems, each with a different performance profile, which makes it hard for job schedulers to achieve the best usage of the infrastructure. A useful piece of information for scheduling jobs, typically not available, is the extent to which applications will use available resources once they are executed. This paper comparatively assesses the suitability of several machine learning techniques for predicting the spatiotemporal utilization of resources by applications. Modern machine learning techniques able to handle large numbers of attributes are used, taking into account application- and system-specific attributes (e.g., CPU microarchitecture, size and speed of memory and storage, input data characteristics and input parameters). The work also extends an existing classification tree algorithm, called Predicting Query Runtime (PQR), to the regression problem by allowing each leaf of the tree to select the best regression method for the collection of data at that leaf. The new method (PQR2) yields the lowest average percentage error when predicting execution time, memory and disk consumption for two bioinformatics applications, BLAST and RAxML, deployed in scenarios that differ in system and usage. In specific scenarios where usage is a non-linear function of system and application attributes, certain configurations of two other machine learning algorithms, Support Vector Machine and k-nearest neighbors, also yield competitive results. In addition, experiments show that including system performance and application-specific attributes improves the performance of the machine learning algorithms investigated.
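
The idea of letting each leaf choose its own regression method can be illustrated with a minimal sketch. This is not the authors' PQR2 implementation: the candidate regressors (`fit_mean`, `fit_linear`, `fit_1nn`), the single numeric attribute, and the selection criterion (leave-one-out mean percentage error) are all assumptions made for illustration, since the abstract does not specify them.

```python
from statistics import mean

# Hypothetical candidate regressors for one leaf. Each takes the leaf's
# training data (one numeric attribute xs, target ys) and returns a predictor.

def fit_mean(xs, ys):
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    # Ordinary least squares on a single attribute.
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # all x equal: fall back to a constant prediction
        return lambda x: my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_1nn(xs, ys):
    # 1-nearest-neighbor: return the target of the closest training point.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "1nn": fit_1nn}

def loo_percentage_error(fit, xs, ys):
    # Leave-one-out mean absolute percentage error (assumed criterion).
    # Requires at least two samples and non-zero targets.
    errors = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(abs(model(xs[i]) - ys[i]) / abs(ys[i]))
    return 100 * mean(errors)

def select_leaf_model(xs, ys):
    # Score every candidate on this leaf's data and keep the best one,
    # refitted on all of the leaf's samples.
    scores = {name: loo_percentage_error(fit, xs, ys)
              for name, fit in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, CANDIDATES[best](xs, ys), scores

# Example leaf: runtime grows linearly with an input-size attribute.
name, model, scores = select_leaf_model([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
print(name, model(6))
```

In a full tree, `select_leaf_model` would be called once per leaf, so different regions of the attribute space may end up with different regression methods.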
