Adaptive Resource Allocation with Job Runtime Uncertainty

In this paper, we address the problem of dynamic resource allocation in the presence of job runtime uncertainty. We develop an execution delay model for runtime prediction and design an adaptive stochastic allocation strategy named Pareto Fractal Flow Predictor (PFFP). We conduct a comprehensive performance evaluation of PFFP on real production traces and compare it with other well-known non-clairvoyant strategies over two metrics. To choose the best strategy, we perform a bi-objective analysis based on a degradation methodology. Because a small portion of problem instances with large deviations can dominate and bias the conclusions, we also present performance profiles of the strategies. We show that PFFP performs well in different scenarios with a variety of workloads and distributed resources.
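The abstract does not spell out the degradation methodology or how the performance profiles are built. As a rough illustration only, the sketch below uses one common formulation: degradation as the relative gap to the best strategy on an instance, and a performance profile as the fraction of instances on which a strategy stays within a factor tau of the best. The function names, the "Random" baseline, and all metric values are hypothetical, and metric values are assumed to be positive with lower being better.

```python
def degradation(values):
    """Relative degradation of each strategy versus the best (smallest)
    metric value on a single problem instance. Assumes positive values."""
    best = min(values.values())
    return {name: (v - best) / best for name, v in values.items()}


def performance_profile(results, tau):
    """Fraction of instances on which each strategy is within a factor
    tau (tau >= 1) of the best strategy for that instance."""
    strategies = list(results[0].keys())
    counts = {s: 0 for s in strategies}
    for instance in results:
        best = min(instance.values())
        for s, v in instance.items():
            if v <= tau * best:
                counts[s] += 1
    n = len(results)
    return {s: counts[s] / n for s in strategies}


# Hypothetical metric values (lower is better) on three instances.
# The third instance is an outlier for PFFP.
results = [
    {"PFFP": 10.0, "Random": 12.0},
    {"PFFP": 8.0, "Random": 8.0},
    {"PFFP": 50.0, "Random": 20.0},
]

for tau in (1.0, 1.25, 2.5):
    print(tau, performance_profile(results, tau))
```

Averaging degradations lets a single outlier instance (such as the third one here) dominate the comparison, whereas the profile reports how often each strategy is close to the best, which is the kind of bias the performance profiles are meant to expose.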
