Optimal Disk Storage Allocation for Multitier Storage System

Current storage systems face a performance bottleneck caused by the gap between fast CPU computing speed and the slow response time of hard disks. Recently, multitier hybrid storage systems (MTHS), which use fast flash devices such as solid-state drives (SSDs) as one of the high-performance storage tiers, have been proposed to boost storage system performance. To maintain the overall performance of an MTHS, disk storage assignment must be designed so that the data migrated to a high-performance tier such as SSD is the optimal set of data. In this paper, we propose an optimal data allocation algorithm for disk storage in MTHS. The data allocation problem (DAP) is to find the optimal list of data files for each storage tier in the MTHS so as to achieve the maximal benefit value without exceeding the available size of each tier. We formulate the DAP as a special multiple-choice knapsack problem (MCKP) and propose a multiple-stage dynamic programming (MDP) algorithm to find the optimal solutions. The results show that MDP achieves improvements of up to 6 times compared with existing greedy algorithms.
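To make the knapsack formulation concrete, the following is a minimal sketch of a generic dynamic program for a multiple-choice allocation of files to tiers. It is not the MDP algorithm of the paper, whose stages are not described in this abstract. The function name `allocate`, the inputs `sizes`, `benefits`, and `capacities`, and the assumption that any unplaced file stays on an unlimited base tier with zero benefit are all illustrative choices.

```python
from functools import lru_cache


def allocate(sizes, benefits, capacities):
    """Exhaustive DP over (file index, remaining capacity of each fast tier).

    sizes[i]       -- size of file i (hypothetical units, integers)
    benefits[i][t] -- benefit of placing file i on fast tier t (assumed given)
    capacities[t]  -- available size of fast tier t
    Files that are not placed on any fast tier are assumed to remain on an
    unlimited base tier (e.g. HDD) and contribute zero benefit.
    Returns (best total benefit, plan), where plan[i] is the chosen tier
    index for file i, or None if the file stays on the base tier.
    """
    n = len(sizes)

    @lru_cache(maxsize=None)
    def best(i, remaining):
        if i == n:
            return 0, ()
        # Choice 0: leave file i on the base tier.
        value, plan = best(i + 1, remaining)
        best_value, best_plan = value, (None,) + plan
        # Remaining choices: place file i on one fast tier that still has room.
        for t, room in enumerate(remaining):
            if sizes[i] <= room:
                nxt = remaining[:t] + (room - sizes[i],) + remaining[t + 1:]
                value, plan = best(i + 1, nxt)
                value += benefits[i][t]
                if value > best_value:
                    best_value, best_plan = value, (t,) + plan
        return best_value, best_plan

    return best(0, tuple(capacities))


if __name__ == "__main__":
    # Three files, one fast tier of capacity 10 (made-up numbers).
    print(allocate([6, 5, 5], [[7], [5], [5]], [10]))
```

In this made-up instance, a greedy rule that ranks files by benefit per unit size would place file 0 first (benefit 7) and then have no room for the others, whereas the DP places files 1 and 2 for a total benefit of 10. The state space grows with the product of the tier capacities, so this sketch is only suitable for small inputs.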
