Paper Information - Quantifying the Brown Side of Priority Schedulers: Lessons from Big Clusters

Quantifying the Brown Side of Priority Schedulers: Lessons from Big Clusters

Scheduling is central to achieving "green" data centers, i.e., distributing diverse workloads across heterogeneous resources in an energy-efficient manner. Taking the opposite perspective from most related work, this paper reveals the "brown" side of scheduling, i.e., wasted core seconds (so-called brown resources), using field analysis and trace-driven simulation of a Google cluster trace. First, based on the trace, we pinpoint the dependency between priority scheduling and task eviction that causes brown resources and present a brief characterization study focusing on workload priorities. Next, to better understand and further reduce the resource "inefficiency" of priority scheduling, we develop a slot-based scheduler and simulator with various tunable system parameters. Our key finding is that low-priority tasks suffer greatly in terms of both response time and CPU resources because of a high probability of being evicted and resubmitted. We propose simple threshold-based policies that consider the trade-off between task drop rates and the core seconds wasted when evicted tasks are resubmitted. Our experimental results show that we are able to effectively mitigate brown resources without sacrificing the performance advantages of priority scheduling.

Andrea Rosà | Walter Binder | Lydia Y. Chen | Fatih Alagöz | Derya Çavdar

[1] Joseph L. Hellerstein, et al. Obfuscatory obscurantism: Making workload traces of commercially-sensitive systems safe to release, 2012, 2012 IEEE Network Operations and Management Symposium.

[2] Cristina L. Abad, et al. Natjam: design and evaluation of eviction policies for supporting priorities and deadlines in mapreduce clusters, 2013, SoCC.

[3] Adam Wierman, et al. Renewable and cooling aware workload management for sustainable data centers, 2012, SIGMETRICS '12.

[4] Randy H. Katz, et al. Heterogeneity and dynamicity of clouds at scale: Google trace analysis, 2012, SoCC '12.

[5] Luiz André Barroso, et al. The tail at scale, 2013, CACM.

[6] Luiz André Barroso, et al. Web Search for a Planet: The Google Cluster Architecture, 2003, IEEE Micro.

[7] Franck Cappello, et al. Characterizing Cloud Applications on a Google Data Center, 2013, 2013 42nd International Conference on Parallel Processing.