Flash storage disaggregation

PCIe-based Flash is commonly deployed to provide datacenter applications with high IO rates. However, its capacity and bandwidth are often underutilized, because it is difficult to design servers with the right balance of CPU, memory, and Flash resources over time and for multiple applications. This work examines Flash disaggregation as a way to deal with Flash overprovisioning. We tune remote access to Flash over commodity networks and analyze its impact on workloads sampled from real datacenter applications. We show that, while remote Flash access introduces a 20% throughput drop at the application level, disaggregation allows us to make up for these overheads through resource-efficient scale-out. Hence, we show that Flash disaggregation allows CPU and Flash resources to be scaled independently in a cost-effective manner. We use our analysis to draw conclusions about data and control plane issues in remote storage.
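To make the scale-out argument concrete, the sketch below works through the arithmetic implied by a 20% per-server throughput drop: sustaining the same aggregate throughput requires roughly 1/0.8 = 1.25 times as many application servers. Only the 20% figure comes from the abstract; the per-server rate, the aggregate target, and the names used here are hypothetical values chosen for illustration, and the sketch says nothing about cost.

```python
def servers_needed(target_qps: int, per_server_qps: int) -> int:
    """Smallest number of servers whose combined throughput meets the target."""
    # Integer ceiling division, avoids floating-point rounding.
    return -(-target_qps // per_server_qps)


# Hypothetical numbers, chosen only for illustration (not from the paper).
LOCAL_QPS = 100_000        # per-server throughput with local Flash
TARGET_QPS = 1_000_000     # aggregate throughput the service must sustain

# The 20% application-level throughput drop quoted in the abstract.
REMOTE_DROP_PERCENT = 20

# Per-server throughput when Flash is accessed remotely.
remote_qps = LOCAL_QPS * (100 - REMOTE_DROP_PERCENT) // 100

local_servers = servers_needed(TARGET_QPS, LOCAL_QPS)
remote_servers = servers_needed(TARGET_QPS, remote_qps)

print(f"local Flash:  {local_servers} servers")
print(f"remote Flash: {remote_servers} servers")
```

Under these assumed numbers, the extra application servers carry no Flash of their own, which is the sense in which disaggregation lets CPU and Flash capacity grow separately.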
