Achieving Fairness-Aware Two-Level Scheduling for Heterogeneous Distributed Systems

In a heterogeneous distributed system composed of different types of computing platforms, such as supercomputers, grids, and clouds, a two-level scheduling approach can be used to execute many-task applications: the first level distributes the resources of the platforms among users, and the second level maps each user's tasks onto the nodes of each platform. When scheduling heterogeneous resources, the service provider must consider fairness among multiple users as well as system efficiency. However, fairness cannot be achieved simply by giving every user an equal amount of resources from each platform. In this paper, we investigate how to address fairness among multiple users in a heterogeneous distributed system. We present three first-level resource allocation policies (a provider affinity first policy, an application affinity first policy, and a platform affinity based round-robin policy) and two second-level task mapping policies (a most affected first policy and a co-runner affinity based round-robin policy). Using trace-based simulations, we evaluate the performance of various combinations of the first-level and second-level scheduling policies. Our extensive simulation results demonstrate that the first-level policy plays a crucial role in achieving relatively good fairness.
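The claim that an equal split of every platform does not guarantee fairness can be illustrated with a small sketch. Everything in it is an assumption made for illustration only: the two users, the per-node throughput figures, the node counts, the linear throughput model, and the choice of delivered throughput scored with Jain's fairness index as the fairness measure. It is not the paper's fairness definition, and the second allocation is hand-picked rather than produced by any of the paper's policies.

```python
def jain_index(values):
    """Jain's fairness index: 1.0 when all values are equal, lower otherwise."""
    total = sum(values)
    squares = sum(v * v for v in values)
    return (total * total) / (len(values) * squares)


# Hypothetical per-node throughput (tasks/hour) of each user's application
# on each platform; user_a's application strongly prefers the supercomputer.
RATES = {
    "user_a": {"supercomputer": 10.0, "cloud": 2.0},
    "user_b": {"supercomputer": 4.0, "cloud": 4.0},
}


def throughput(allocation):
    """Delivered throughput per user for a {user: {platform: nodes}} allocation.

    Assumes throughput scales linearly with node count (a simplification).
    """
    return {
        user: sum(RATES[user][platform] * nodes for platform, nodes in share.items())
        for user, share in allocation.items()
    }


# Each platform is assumed to have 10 nodes.
# Equal split: every user receives half of every platform.
equal_split = {
    "user_a": {"supercomputer": 5, "cloud": 5},
    "user_b": {"supercomputer": 5, "cloud": 5},
}

# Hand-picked affinity-aware split that equalizes delivered throughput.
affinity_aware = {
    "user_a": {"supercomputer": 4, "cloud": 4},
    "user_b": {"supercomputer": 6, "cloud": 6},
}

equal_tp = throughput(equal_split)        # user_a: 60.0, user_b: 40.0
aware_tp = throughput(affinity_aware)     # user_a: 48.0, user_b: 48.0

equal_fairness = jain_index(list(equal_tp.values()))
aware_fairness = jain_index(list(aware_tp.values()))

print("equal split:   ", equal_tp, "fairness", round(equal_fairness, 3))
print("affinity-aware:", aware_tp, "fairness", round(aware_fairness, 3))
```

Under these assumed numbers the equal split leaves the two users with different throughput, while the second split equalizes throughput at the cost of a lower total (96 instead of 100 tasks/hour). This is one way to see the tension between fairness and system efficiency that the abstract describes.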
