ReCA: An Efficient Reconfigurable Cache Architecture for Storage Systems with Online Workload Characterization

In recent years, Solid-State Drives (SSDs) have gained tremendous attention in computing and storage systems due to their significant performance improvement over Hard Disk Drives (HDDs). The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to taking effective advantage of SSDs is to use them as a caching layer that stores performance-critical data blocks, reducing the number of accesses to the HDD-based disk subsystem. Because flash-based SSDs have limited write endurance and long write latency, caching algorithms employed at the Operating System (OS) level must take these characteristics into account. Previous OS-level caching techniques are optimized for only one type of application, which limits both their generality and their applicability. In addition, they do not adapt when the workload pattern changes over time. This paper presents an efficient Reconfigurable Cache Architecture (ReCA) for storage systems that uses a comprehensive workload characterization to find an optimal cache configuration for I/O-intensive applications. For this purpose, we first investigate various types of I/O workloads and classify them into five major classes. Based on this characterization, an optimal cache configuration is presented for each class of workloads. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime and reconfigure the cache organization if the application changes from one class of workloads to another. The cache reconfiguration is performed online, and the workload classes can be extended to cover emerging I/O workloads so that the architecture remains efficient as the characteristics of I/O requests change. Experimental results obtained by implementing ReCA in a 4U rackmount server with SATA 6Gb/s disk interfaces running Linux 3.17.0 show that the proposed architecture improves performance and lifetime by up to 24 and 33 percent, respectively.
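The monitor-classify-reconfigure loop described above can be illustrated with a minimal Python sketch. It is not the ReCA implementation: the abstract does not define the five workload classes, the features used to distinguish them, or the cache configurations, so the class names, the two features (write ratio and sequentiality), the thresholds, the window size, and the CacheConfig fields below are all hypothetical placeholders chosen only to show the control flow.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Request:
    lba: int        # starting block address
    blocks: int     # request length in blocks
    is_write: bool


@dataclass(frozen=True)
class CacheConfig:
    # Hypothetical knobs; the abstract does not list the real parameters.
    write_policy: str
    promote_reads: bool


def features(window) -> Dict[str, float]:
    """Summarize a non-empty window of requests (assumed features)."""
    reqs = list(window)
    writes = sum(r.is_write for r in reqs)
    # A request counts as sequential if it starts where the previous one ended.
    seq = sum(1 for prev, cur in zip(reqs, reqs[1:])
              if cur.lba == prev.lba + prev.blocks)
    return {"write_ratio": writes / len(reqs),
            "seq_ratio": seq / max(len(reqs) - 1, 1)}


def toy_classify(f: Dict[str, float]) -> str:
    """Placeholder classifier with made-up classes and thresholds."""
    if f["seq_ratio"] >= 0.8:
        return "sequential"
    if f["write_ratio"] >= 0.5:
        return "random-write"
    return "random-read"


class Monitor:
    """Classifies a sliding window online and reports class changes."""

    def __init__(self, configs: Dict[str, CacheConfig],
                 classify: Callable[[Dict[str, float]], str],
                 window_size: int = 4):
        self.configs = configs
        self.classify = classify
        self.window = deque(maxlen=window_size)
        self.current_class: Optional[str] = None
        self.reconfigurations = 0

    def observe(self, req: Request) -> Optional[CacheConfig]:
        """Return a new configuration only when the workload class changes."""
        self.window.append(req)
        if len(self.window) < self.window.maxlen:
            return None  # not enough history yet
        cls = self.classify(features(self.window))
        if cls == self.current_class:
            return None
        self.current_class = cls
        self.reconfigurations += 1
        return self.configs[cls]


# Hypothetical per-class configurations.
CONFIGS = {
    "sequential": CacheConfig(write_policy="write-around", promote_reads=False),
    "random-write": CacheConfig(write_policy="write-back", promote_reads=False),
    "random-read": CacheConfig(write_policy="write-through", promote_reads=True),
}

monitor = Monitor(CONFIGS, toy_classify)
for lba in (0, 8, 16, 24):                      # sequential reads
    monitor.observe(Request(lba, 8, is_write=False))
class_after_reads = monitor.current_class
for lba in (1000, 50, 7000, 300):               # scattered writes
    monitor.observe(Request(lba, 8, is_write=True))
class_after_writes = monitor.current_class
```

Because the classifier is passed in as a function and the configurations live in a dictionary, a new workload class can be added without changing the monitor, which mirrors the extensibility the abstract mentions. A practical version would likely add hysteresis so that a brief burst does not trigger a reconfiguration; this sketch switches as soon as the window's class changes.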
