An Empirical Study on the Interplay between Filesystems and SSD

This study presents a comprehensive empirical investigation of the interplay between widely deployed file systems and an SSD. We test and analyze the performance of four popular Linux file systems, namely ext2, ext3, XFS and Reiserfs, on an SSD under a range of workloads. The study serves two purposes. First, given the rapid adoption of SSDs, we are motivated to obtain first-hand measurements of the actual performance of this emerging storage technology, especially in everyday deployment scenarios. Second, we attempt to disclose the internal details behind the SSD's thin interface from a high-level perspective, in contrast to previous studies, which typically rely only on micro-benchmarks. This study yields several interesting and useful findings: (1) different file systems perform disparately on the SSD because of their differing design principles, sometimes with up to an order of magnitude of performance discrepancy; (2) file system format/mount options and workload characteristics have significant impacts on performance; (3) the SSD delivers optimal performance only when used in a flash-friendly manner; and (4) workloads, file systems and the SSD interact in an intrinsically complicated way, so users should consider all three factors together when setting up SSD-based storage systems in order to achieve the best combined performance.
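To make finding (2) concrete, the fragment below sketches mount options that commonly influence SSD performance on Linux. These are standard mount(8)/fstab(5) knobs, not settings taken from the study; which combination wins depends on the particular drive and workload, which is precisely the interplay the study examines.

```shell
# Illustrative /etc/fstab entries (device paths and mount points are
# hypothetical) showing flash-relevant mount options:
#   noatime        - suppress access-time updates, avoiding extra small
#                    writes that are expensive on flash
#   data=writeback - relax ext3's journaling mode, reducing journal write
#                    traffic at the cost of weaker ordering guarantees
/dev/sda1  /mnt/ssd-ext3  ext3  noatime,data=writeback  0 2
/dev/sda2  /mnt/ssd-xfs   xfs   noatime                 0 2
```

On kernels and file systems that support it, periodically running `fstrim` on such mounts additionally informs the drive of freed blocks, helping the flash translation layer reclaim space.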
