FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs

A longstanding goal of SSD virtualization has been to provide performance isolation between multiple tenants sharing the device. Virtualizing SSDs, however, has traditionally been a challenge because of the fundamental tussle between resource isolation and the lifetime of the device: existing SSDs aim to age all regions of flash uniformly, and this hurts isolation. We propose utilizing flash parallelism to improve isolation between virtual SSDs by running them on dedicated channels and dies. Furthermore, we offer a complete solution by also managing the wear. We propose allowing the wear of different channels and dies to diverge at fine time granularities in favor of isolation, and adjusting that imbalance at a coarse time granularity in a principled manner. Our experiments show that the new SSD wears uniformly while the 99th percentile latencies of storage operations in a variety of multi-tenant settings are reduced by up to 3.1x compared to software-isolated virtual SSDs.
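
The interplay between fine-grained divergence and coarse-grained correction can be illustrated with a toy model. The sketch below is not the paper's algorithm: the names `Channel`, `write`, `rebalance`, and `simulate`, the two-tenant workload, and the erase-count threshold are all assumptions made for illustration. Each tenant writes only to its dedicated channel, so wear diverges between rebalancing points; every `period` epochs the most-worn and least-worn channels exchange tenants if their gap exceeds `threshold`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Channel:
    cid: int
    tenant: Optional[str] = None
    erases: int = 0  # cumulative erase count, used here as a wear proxy


def write(channels: List[Channel], tenant: str, n_erases: int) -> None:
    """Fine-grained step: charge wear only to the tenant's own channels."""
    owned = [c for c in channels if c.tenant == tenant]
    for c in owned:
        c.erases += n_erases // len(owned)


def rebalance(channels: List[Channel], threshold: int) -> bool:
    """Coarse-grained step: swap the tenants of the most-worn and
    least-worn channels when their wear gap exceeds the threshold.
    A real device would also have to migrate data; that is omitted here."""
    hot = max(channels, key=lambda c: c.erases)
    cold = min(channels, key=lambda c: c.erases)
    if hot.erases - cold.erases > threshold:
        hot.tenant, cold.tenant = cold.tenant, hot.tenant
        return True
    return False


def simulate(epochs: int, period: int, threshold: int) -> List[int]:
    """Hypothetical workload: tenant A is write-heavy, tenant B is light."""
    channels = [Channel(0, "A"), Channel(1, "B")]
    rates = {"A": 100, "B": 10}  # erases per epoch (assumed values)
    for epoch in range(1, epochs + 1):
        for tenant, rate in rates.items():
            write(channels, tenant, rate)
        if epoch % period == 0:
            rebalance(channels, threshold)
    return [c.erases for c in channels]


if __name__ == "__main__":
    print("with rebalancing:   ", simulate(40, 10, 500))
    print("without rebalancing:", simulate(40, 10, 10**9))
```

In this toy setting, wear is allowed to drift apart within each 10-epoch window, and the periodic swap pulls the two channels back toward the same erase count. Setting the threshold high enough that no swap ever occurs leaves the write-heavy tenant's channel far more worn than the other.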
