Paper Information - CLITE: Efficient and QoS-Aware Co-Location of Multiple Latency-Critical Jobs for Warehouse Scale Computers

Large-scale data centers run latency-critical jobs with quality-of-service (QoS) requirements alongside throughput-oriented background jobs that need to achieve high performance. Previously proposed methods cannot co-locate multiple latency-critical jobs with multiple background jobs while (1) meeting the QoS requirements of all latency-critical jobs and (2) maximizing the performance of the background jobs. This paper proposes CLITE, a Bayesian Optimization-based, multi-resource partitioning technique that achieves these goals. CLITE is publicly available at https://github.com/GoodwillComputingLab/CLITE.

Tirthak Patel | Devesh Tiwari
