CLIBE: Precise Cluster-Level I/O Bandwidth Enforcement in Distributed File System

A distributed file system (DFS) is a core building block of big data applications. On the one hand, a DFS manages large volumes of data while balancing desirable properties such as high availability and reliability. On the other hand, a DFS relies on underlying storage devices (e.g., hard drives and solid-state drives) and therefore suffers from slow read/write operations. In the big data era, large-scale data processing applications increasingly leverage in-memory processing to reduce the prohibitive cost of I/O operations, yet they must still read input data from and write outputs to the storage system. Slow I/O operations thus remain the main bottleneck of many emerging big data applications. In particular, although these applications often store their results in a DFS for high availability and reliability, unmanaged I/O bandwidth contention leads to QoS violations for high-priority applications when multiple applications share the same DFS. To enable I/O management and allocation on big data platforms, we propose a Cluster-Level I/O Bandwidth Enforcement (CLIBE) approach that consists of a cluster-level I/O bandwidth quota manager, multiple node-level I/O bandwidth controllers, and a feedback-based quota reallocator. The quota manager splits an application's I/O bandwidth quota and distributes it to the active nodes serving that application. The bandwidth controller on each node ensures that the I/O bandwidth used by the application does not exceed its quota on that node. For an application affected by slow or overloaded nodes, the quota reallocator reassigns idle I/O bandwidth from underloaded nodes to that application to guarantee its throughput. Our experiments on a real-system cluster show that CLIBE precisely controls the I/O bandwidth used by an application at the cluster level, with a deviation smaller than 2.51%.
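To make the division of labor between the three components concrete, the following is a minimal Python sketch of the idea: a cluster-level quota is split evenly across an application's active nodes, each node caps the application's I/O with a token bucket, and a feedback step moves idle quota from underloaded nodes to saturated ones. All names (QuotaManager, NodeController, admit, reallocate), the even split, and the token-bucket enforcement are our own illustrative assumptions, not CLIBE's actual implementation.

```python
# Illustrative sketch only: the real CLIBE components run inside a DFS; here we
# model the quota split, per-node enforcement, and feedback reallocation with
# plain Python objects and a token-bucket throttle (our assumption).
import time


class NodeController:
    """Caps an application's I/O on one node with a simple token bucket."""

    def __init__(self, quota_bps):
        self.quota_bps = quota_bps       # per-node bandwidth quota (bytes/s)
        self.tokens = quota_bps          # available I/O budget
        self.last_refill = time.monotonic()
        self.used_bps = 0.0              # measured usage, reported back to the manager

    def admit(self, nbytes):
        """Return seconds the caller should sleep before issuing an I/O of nbytes."""
        now = time.monotonic()
        self.tokens = min(self.quota_bps,
                          self.tokens + (now - self.last_refill) * self.quota_bps)
        self.last_refill = now
        self.tokens -= nbytes            # may go negative; the debt is paid by waiting
        return max(0.0, -self.tokens / self.quota_bps)


class QuotaManager:
    """Splits an application's cluster-level quota across its active nodes."""

    def __init__(self, app_quota_bps, active_nodes):
        self.app_quota_bps = app_quota_bps
        # Even split as a starting point; a real system may weight nodes differently.
        share = app_quota_bps / len(active_nodes)
        self.controllers = {n: NodeController(share) for n in active_nodes}

    def reallocate(self, headroom_fraction=0.9):
        """Feedback step: move idle quota from underloaded nodes to saturated ones."""
        under = [c for c in self.controllers.values()
                 if c.used_bps < headroom_fraction * c.quota_bps]
        over = [c for c in self.controllers.values()
                if c.used_bps >= headroom_fraction * c.quota_bps]
        if not under or not over:
            return
        idle = sum(c.quota_bps - c.used_bps for c in under)
        # Shrink underloaded quotas to what those nodes actually use...
        for c in under:
            c.quota_bps = max(c.used_bps, 1.0)
        # ...and hand the reclaimed bandwidth to the saturated nodes.
        bonus = idle / len(over)
        for c in over:
            c.quota_bps += bonus
```

In this sketch, a per-node agent would call admit() before each read or write and sleep for the returned delay, while the manager would periodically call reallocate() using the per-node usage measurements reported in used_bps; the actual CLIBE mechanisms for measurement and enforcement may differ.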
