Everest: Scaling Down Peak Loads Through I/O Off-Loading

Bursts in data center workloads are a real problem for storage subsystems. Data volumes can experience peak I/O request rates that are over an order of magnitude higher than the average load. This forces substantial overprovisioning, yet often still results in high I/O request latency during peaks. To address this problem we propose Everest, which allows data written to an overloaded volume to be temporarily off-loaded into a short-term virtual store. Everest creates the short-term store by opportunistically pooling underutilized storage resources, either on a single server or across servers within the data center. Writes are temporarily off-loaded from overloaded volumes to lightly loaded volumes, thereby reducing the I/O load on the former. Everest is transparent to and usable by unmodified applications, and does not change the persistence or consistency of the storage system. We evaluate Everest using traces from a production Exchange mail server as well as other benchmarks: our results show a 1.4-70 times reduction in mean response times during peaks.
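
To make the off-loading mechanism concrete, here is a minimal sketch in Python of the idea described above: when the base volume is overloaded, writes are redirected to the least loaded spare volume, reads consult a redirect map so they always see the latest version, and a reclaim pass later drains off-loaded data back to the base volume. The names (OffloadStore, OffloadManager, load_threshold) and the in-memory dictionaries are illustrative assumptions, not Everest's actual block-level implementation.

    # Sketch of write off-loading; all names and structures are assumptions
    # introduced for illustration, not the Everest code.

    class OffloadStore:
        """Stands in for a lightly loaded volume lending spare capacity."""
        def __init__(self, name):
            self.name = name
            self.blocks = {}   # (volume, block_id) -> (version, data)
            self.load = 0.0    # fraction of capacity/bandwidth in use

        def write(self, volume, block_id, version, data):
            self.blocks[(volume, block_id)] = (version, data)

        def read(self, volume, block_id):
            return self.blocks.get((volume, block_id))


    class OffloadManager:
        """Redirects writes away from an overloaded volume and tracks
        where the latest version of each block currently lives."""
        def __init__(self, base_volume, stores, load_threshold=0.8):
            self.base = base_volume            # dict: block_id -> data
            self.stores = stores
            self.load_threshold = load_threshold
            self.redirects = {}                # block_id -> (store, version)
            self.version = 0

        def write(self, block_id, data, base_load):
            self.version += 1
            if base_load > self.load_threshold and self.stores:
                # Base volume is overloaded: off-load to the least loaded store.
                store = min(self.stores, key=lambda s: s.load)
                store.write("base", block_id, self.version, data)
                self.redirects[block_id] = (store, self.version)
            else:
                # Normal path: write to the base volume, drop any stale redirect.
                self.base[block_id] = data
                self.redirects.pop(block_id, None)

        def read(self, block_id):
            # The latest version may be off-loaded; check the redirect map first.
            if block_id in self.redirects:
                store, _ = self.redirects[block_id]
                return store.read("base", block_id)[1]
            return self.base.get(block_id)

        def reclaim(self):
            # During idle periods, copy off-loaded blocks back to the base volume.
            for block_id, (store, _) in list(self.redirects.items()):
                self.base[block_id] = store.read("base", block_id)[1]
                del self.redirects[block_id]


    # Usage: during a peak the write is diverted, yet reads stay consistent.
    stores = [OffloadStore("srv2-vol0"), OffloadStore("srv3-vol1")]
    mgr = OffloadManager(base_volume={}, stores=stores)
    mgr.write(42, b"payload", base_load=0.95)   # peak: goes to a spare volume
    assert mgr.read(42) == b"payload"
    mgr.reclaim()                               # later: drained back to base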
