A Self-Learning Scheduling in Cloud Software Defined Block Storage

Software Defined Storage (SDS) separates the control layer from the data layer, which allows data management to be automated and Commercial Off-The-Shelf (COTS) storage media to be deployed in place of expensive traditional hardware-based solutions. Cloud block storage services lack an SDS framework that allows customization of block storage, policy enforcement, automated provisioning, and storage management. SDS reduces human intervention and improves resource utilization. Moreover, SDS allows cloud tenants to define customized functionality based on their needs, with guaranteed performance and high availability that meet Service Level Agreements (SLAs). However, maintaining SLA requirements in cloud block storage is challenging because of storage cluster features, workload interference, workload characteristics, and other indirectly related latent variables. To address these issues, cloud providers often over-provision storage resources. Moving towards SDS, we propose a framework for cloud block storage as an active storage system. Our framework provides customization of block storage services and optimized scheduling decisions based on the workload characteristics and the performance of the underlying data layer, leveraging a self-learning scheduler. The proposed scheduler treats the storage backend nodes as a black box and requires zero knowledge of their internal states. We showcase a practical application of the proposed scheduler in our private OpenStack deployment.
