AlloX: Allocation across Computing Resources for Hybrid CPU/GPU clusters
GPUs are widely used as accelerators alongside CPUs. We call applications that can exploit them GPU applications. Some machine learning frameworks, such as TensorFlow, allow their machine learning (ML) jobs to run on either CPUs or GPUs. Nvidia claims that the Titan GPU K80 12GB can speed up such jobs by 5-10x on average. Although GPUs offer better performance, they are very expensive: a K80 GPU costs roughly $4000, whereas a quad-core Intel Xeon E5 costs about $350.
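To make the trade-off concrete, the prices and speedup range quoted above can be combined into a rough price-performance ratio. The sketch below is only back-of-the-envelope arithmetic: it assumes one CPU delivers one unit of throughput and ignores power, host machines, and utilization, none of which are given in the text. The function name `relative_cost_per_throughput` is hypothetical.

```python
# Prices and speedup range quoted in the text; everything else is an assumption.
GPU_PRICE_USD = 4000.0        # K80
CPU_PRICE_USD = 350.0         # quad-core Xeon E5
SPEEDUP_RANGE = (5.0, 10.0)   # claimed average GPU speedup over the CPU


def relative_cost_per_throughput(gpu_price, cpu_price, speedup):
    """Hardware dollars per unit of throughput on the GPU, relative to the CPU.

    A value of 1.0 means the GPU and the CPU cost the same per unit of work;
    a value above 1.0 means the GPU is the more expensive way to buy throughput.
    """
    return (gpu_price / cpu_price) / speedup


if __name__ == "__main__":
    print(f"price ratio: {GPU_PRICE_USD / CPU_PRICE_USD:.2f}x")
    for s in SPEEDUP_RANGE:
        r = relative_cost_per_throughput(GPU_PRICE_USD, CPU_PRICE_USD, s)
        print(f"speedup {s:.0f}x -> relative cost {r:.2f}")
```

Under these simplifying assumptions the GPU costs about 11.4 times as much as the CPU, so at a 5-10x speedup it is still roughly 1.1 to 2.3 times more expensive per unit of throughput when only purchase price is counted.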
The coexistence of traditional CPU applications and GPU applications pushes cloud operators to build hybrid CPU/GPU clusters. Traditional applications run only on CPUs, whereas GPU applications can run on either CPUs or GPUs. This raises two questions: how should a hybrid CPU/GPU cluster be provisioned for both kinds of applications, and how should resources be allocated across CPUs and GPUs?
Interchangeable resources such as CPUs and GPUs are not rare in large clusters. Network I/O cards of different types and bandwidths, such as wireless, Ethernet, and InfiniBand, can also be interchangeable.
In this paper, we focus on CPU/GPU systems. We develop a tool that estimates the performance and resource demand of an ML job online (§2). We implement AlloX, a system that allocates resources and places each application on the right resource (CPU or GPU) to maximize the use of computational resources (§3). The proposed AlloX policy improves progress by up to 35% compared to the default DRF policy [3]. We build a model that minimizes the total cost of ownership of CPU/GPU data centers (§4).
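For readers unfamiliar with the DRF baseline mentioned above, the following is a minimal sketch of Dominant Resource Fairness as progressive filling, based on the DRF paper listed in the references. It is not the AlloX policy, which the text above does not specify. The function name `drf_allocate` and its arguments are hypothetical, demands and capacities are assumed to be integers, and the loop is simplified to stop as soon as the user with the lowest dominant share cannot fit another task.

```python
from fractions import Fraction


def drf_allocate(capacity, demands):
    """Progressive filling: give one task at a time to the user with the lowest dominant share.

    capacity: list of integer totals, one per resource type.
    demands:  dict mapping user -> per-task integer demand vector (same length as capacity).
              Assumes every task demands a positive amount of at least one resource.
    Returns a dict mapping user -> number of tasks launched.
    """
    used = [0] * len(capacity)
    tasks = {user: 0 for user in demands}

    def dominant_share(user):
        # The largest fraction of any single resource currently held by this user.
        return max(Fraction(tasks[user] * d, c) for d, c in zip(demands[user], capacity))

    while True:
        user = min(demands, key=dominant_share)
        need = demands[user]
        if any(u + n > c for u, n, c in zip(used, need, capacity)):
            break  # simplification: stop when the neediest user's next task does not fit
        used = [u + n for u, n in zip(used, need)]
        tasks[user] += 1
    return tasks


if __name__ == "__main__":
    # Example from the DRF paper: 9 CPUs and 18 GB of memory, two users.
    print(drf_allocate([9, 18], {"A": [1, 4], "B": [3, 1]}))
```

With the DRF paper's example (user A needs 1 CPU and 4 GB per task, user B needs 3 CPUs and 1 GB), the sketch launches three tasks for A and two for B, which leaves both users with a dominant share of 2/3. DRF treats each resource type as distinct, so it has no notion of a job that could run on either a CPU or a GPU.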
[1] Ion Stoica, et al. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics, 2016, NSDI.
[2] Zhenhua Liu, et al. Joint capacity planning and operational management for sustainable data centers and demand response, 2016, e-Energy.
[3] Benjamin Hindman, et al. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types, 2011, NSDI.
[4] Minlan Yu, et al. CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics, 2017, NSDI.