SSD-based Workload Characteristics and Their Performance Implications

Storage systems are designed and optimized using insights derived from analyses of file-system and block-level workloads. However, while SSDs are becoming a dominant building block in many storage systems, their design continues to build on knowledge derived from analyses targeted at hard disk optimization. Though still valuable, this knowledge does not cover important aspects relevant to SSD performance. In a sense, we are “searching under the streetlight,” possibly missing important opportunities for optimizing storage system design. We present the first I/O workload analysis designed with SSDs in mind. We characterize traces from four repositories and examine their “temperature” ranges, sensitivity to page size, and “logical locality.” We then take the first step towards correlating these characteristics with three standard performance metrics: write amplification, read amplification, and flash read costs. Our results show that SSD-specific characteristics strongly affect performance, often in surprising ways.
