Elastic Job Scheduling with Unknown Utility Functions

We consider a bipartite network consisting of job schedulers and parallel servers. Jobs arrive at the schedulers following stochastic processes with unknown arrival rates and are routed to the servers, which execute the jobs with unknown service rates. The jobs are elastic: their “size”, i.e., the amount of service needed for their completion, is determined by the schedulers. After a job finishes execution, a utility is obtained whose value depends on the job’s size through an underlying concave utility function. We consider the setting where the utility functions are unknown a priori, while a noisy observation of the utility value of each job is obtained upon its completion. Our goal is to design a policy that makes job-size and routing decisions to maximize the total utility obtained by the end of the time horizon T. We measure the performance of a policy by its regret, i.e., the gap between the expected utility obtained under the policy and that under the optimal policy. We first establish an upper bound on the regret of a generic policy. The bound consists of two terms: the cumulative difference in utility between the job-size decisions of the policy and the solution to a static optimization problem, and the total backlog of unfinished jobs at the end of the time horizon. We then propose a policy that simultaneously controls the cumulative utility difference and the backlog of unfinished jobs, and achieves an order-optimal regret of Õ(√T). Our policy solves the elastic job scheduling problem by extending the Stochastic Convex Bandit Algorithm to handle unknown and stochastic constraints, and by making routing decisions based on the Join-the-Shortest-Queue rule. It also presents a principled approach to extending algorithms for zeroth-order convex optimization to settings with unknown and stochastic constraints.
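As a rough illustration of the routing half of the policy, the Join-the-Shortest-Queue rule can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names `jsq_route` and `route_job` are hypothetical, and we assume that each server's backlog is tracked as an amount of outstanding work and that ties are broken in favor of the lowest server index. The abstract specifies neither detail.

```python
from typing import List, Sequence


def jsq_route(backlogs: Sequence[float]) -> int:
    """Return the index of the server with the smallest backlog.

    Assumption: ties go to the lowest index (min() returns the first minimum).
    """
    if not backlogs:
        raise ValueError("need at least one server")
    return min(range(len(backlogs)), key=lambda i: backlogs[i])


def route_job(backlogs: List[float], job_size: float) -> int:
    """Route one job of the given size and update the chosen server's backlog.

    Assumption: backlog is measured in units of work, so it grows by job_size.
    """
    server = jsq_route(backlogs)
    backlogs[server] += job_size
    return server


if __name__ == "__main__":
    backlogs = [3.0, 1.0, 2.0]
    chosen = route_job(backlogs, job_size=1.5)
    print(chosen, backlogs)
```

In the setting described above, the job size passed to `route_job` would come from the scheduler's learning algorithm, which this sketch does not attempt to reproduce.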
