The X-Flex cross-platform scheduler: who's the fairest of them all?

We introduce the X-Flex cross-platform scheduler. X-Flex is intended as an alternative to the Dominant Resource Fairness (DRF) scheduler currently employed by both YARN and Mesos. The two schedulers differ in several design choices. First, DRF is based on an instantaneous notion of fairness, while X-Flex monitors instantaneous fairness in order to take a long-term view. The definition of instantaneous fairness itself also differs between the two schedulers. Second, DRF packs containers into processing nodes online, while X-Flex performs the packing offline in order to improve packing quality. Finally, DRF is essentially an extension of the Fair MapReduce scheduler to multiple dimensions, and as such it makes scheduling decisions at a very low level. X-Flex, on the other hand, takes the perspective that some frameworks have sufficient structure to make higher-level scheduling decisions. X-Flex therefore allows frameworks to make such decisions, and it also gives platforms a great deal of autonomy over the degree of sharing they will permit with other platforms. We describe the technical details of X-Flex and provide experiments to show its excellent performance.
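
For readers unfamiliar with the instantaneous fairness notion that DRF uses as its baseline, the following is a minimal sketch of DRF-style progressive filling: at each step, the user with the smallest dominant share (the largest fraction of any one resource they hold) receives one more task. It illustrates DRF only. It is not X-Flex's fairness definition or packing method, neither of which is specified in this abstract. The function names, the tie-breaking rule, and the example cluster are assumptions chosen for illustration.

```python
from typing import Dict

Resources = Dict[str, float]


def dominant_share(allocated: Resources, capacity: Resources) -> float:
    """Largest fraction of any single resource held by one user."""
    return max(allocated[r] / capacity[r] for r in capacity)


def drf_allocate(capacity: Resources, demands: Dict[str, Resources]) -> Dict[str, int]:
    """Progressive filling: repeatedly give one task to the user with the
    lowest dominant share. A user whose next task no longer fits is dropped.
    Ties are broken by user name (an assumption made for determinism)."""
    used = {r: 0.0 for r in capacity}
    alloc = {u: {r: 0.0 for r in capacity} for u in demands}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    while active:
        user = min(sorted(active), key=lambda u: dominant_share(alloc[u], capacity))
        need = demands[user]
        if all(used[r] + need[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += need[r]
                alloc[user][r] += need[r]
            tasks[user] += 1
        else:
            active.discard(user)  # this user's next task does not fit
    return tasks


if __name__ == "__main__":
    # Hypothetical cluster: 9 CPUs and 18 GB of memory shared by two users.
    cluster = {"cpu": 9, "mem": 18}
    per_task = {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}}
    print(drf_allocate(cluster, per_task))
```

Because the decision depends only on the current allocation, this kind of scheduler has no memory of past imbalance, which is the instantaneous behavior the abstract contrasts with a long-term view.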
