Using a Centralized I/O Scheduling Service (CISS) to Improve Cloud Object Storage Performance

Load imbalance reduces the performance of cloud object storage, limiting system utilization and cost efficiency. The state of the art models load status by probing data nodes for decentralized scheduling (e.g., C3 [11]). This paper explores the potential of centralized scheduling. We design a centralized I/O scheduling service (CISS), focusing on availability, i.e., high performance and fault tolerance. For high performance, CISS combines several techniques to deliver a scheduling-decision throughput of 3 million decisions per second. The key insight is that load status can be learned jointly with scheduling, cutting their mutual overhead. First, the basic scheduling operation unit (BSOU) combines scheduling and load learning. Second, scheduling requests are packed into a BSOU stream. Third, scheduling decisions are computed sequentially on the server side. For fault tolerance, CISS adopts a stateless primary-backup model. We implement a distributed object storage prototype to evaluate CISS. Experiments show that CISS delivers utilization and performance similar to C3's, and it reduces tail latency more effectively (up to a 37.8% reduction in maximum latency). Moreover, CISS quickly eliminates the transient performance fluctuation caused by the statelessness of its fault-tolerance strategy, which keeps the cost of that strategy acceptable.
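To make the BSOU idea concrete, below is a minimal Python sketch of the joint learn-and-schedule loop the abstract describes. All names and structures here (the BSOU fields, CentralScheduler, the EWMA smoothing factor alpha, the provisional cost of 1.0) are assumptions for illustration, not the paper's published API: each unit couples a batch of scheduling requests with piggybacked load feedback, so the central scheduler can update its load model and compute decisions in one sequential server-side pass, without separate probing round-trips.

```python
# Hypothetical sketch of BSOU-style centralized scheduling (names assumed,
# not taken from the paper). One BSOU carries a batch of requests plus load
# feedback, so learning and scheduling share a single pass over the stream.

from dataclasses import dataclass, field

@dataclass
class BSOU:
    """Basic scheduling operation unit: a request batch plus load feedback."""
    requests: list                                 # request descriptors to place
    feedback: dict = field(default_factory=dict)   # node -> observed latency (ms)

class CentralScheduler:
    def __init__(self, nodes, alpha=0.2):
        self.nodes = nodes
        self.alpha = alpha                         # EWMA smoothing factor (assumed)
        self.load = {n: 0.0 for n in nodes}        # learned load estimate per node

    def process(self, unit: BSOU):
        # 1. Learn: fold piggybacked latency feedback into the load model,
        #    avoiding dedicated probing round-trips.
        for node, latency in unit.feedback.items():
            self.load[node] = (1 - self.alpha) * self.load[node] + self.alpha * latency

        # 2. Schedule: compute decisions sequentially on the server side,
        #    sending each request to the currently least-loaded node.
        decisions = []
        for req in unit.requests:
            target = min(self.nodes, key=self.load.get)
            decisions.append((req, target))
            self.load[target] += 1.0               # provisional cost until feedback arrives
        return decisions

# Usage: clients pack requests into a BSOU stream; the server drains it in order.
sched = CentralScheduler(nodes=["dn1", "dn2", "dn3"])
stream = [BSOU(requests=["obj1", "obj2"], feedback={"dn1": 4.0, "dn2": 9.0})]
for unit in stream:
    print(sched.process(unit))
```

Under these assumptions, batching requests into BSOUs amortizes per-request coordination cost, and the sequential server-side loop keeps the load model consistent across consecutive decisions, which is what allows the high decision throughput the abstract reports.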

[1] Kevin Skadron et al., "Bubble-Up: Increasing Utilization in Modern Warehouse Scale Computers via Sensible Co-locations," MICRO 2011.

[2] S. A. Brandt et al., "CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data," SC 2006.

[3] Marco Canini et al., "Rein: Taming Tail Latency in Key-Value Stores via Multiget Scheduling," EuroSys 2017.

[4] Toni Cortes et al., "Automatic I/O Scheduler Selection through Online Workload Analysis," UIC/ATC 2012.

[5] Peter J. Varman et al., "mClock: Handling Throughput Variability for Hypervisor IO Scheduling," OSDI 2010.

[6] Yang Liu et al., "Automatic Identification of Application I/O Signatures from Noisy Server-Side Traces," FAST 2014.

[7] Brighten Godfrey et al., "Low Latency via Redundancy," CoNEXT 2013.

[8] Patrick Wendell et al., "DONAR: Decentralized Server Selection for Cloud Services," SIGCOMM 2010.

[9] Vyas Sekar et al., "CFA: A Practical Prediction System for Video QoE Optimization," NSDI 2016.

[10] Carlos Maltzahn et al., "Ceph: A Scalable, High-Performance Distributed File System," OSDI 2006.

[11] Anja Feldmann et al., "C3: Cutting Tail Latency in Cloud Data Stores via Adaptive Replica Selection," NSDI 2015.

[12] M. N. Vora, "Hadoop-HBase for Large-Scale Data," International Conference on Computer Science and Network Technology, 2011.

[13] Yang Liu et al., "Server-Side Log Data Analytics for I/O Workload Characterization and Coordination on Large Shared Storage Systems," 2016.

[14] Mahadev Konar et al., "ZooKeeper: Wait-free Coordination for Internet-scale Systems," USENIX ATC 2010.

[15] Shijie Sun et al., "Pytheas: Enabling Data-Driven Quality of Experience Optimization Using Group-Based Exploration-Exploitation," NSDI 2017.