Flamingo: Enabling Evolvable HDD-based Near-Line Storage

Cloud providers and companies running large-scale data centers offer near-line, cold, and archival data storage, which trades access latency and throughput for lower cost. These services often require physical rack-scale storage designs, e.g. Facebook/Open Compute Project (OCP) Cold Storage or Pelican, which co-design the hardware, mechanics, power, cooling, and software to minimize the cost of supporting the desired workload. A consequence is that the rack resources are restricted, requiring a software stack that can operate within the provided resources. The co-design makes it hard to understand the end-to-end performance impact of relatively small physical design changes and, worse, the software stacks are brittle to these changes. Flamingo supports the design of near-line HDD-based storage racks for cloud services. It requires a physical rack design, a set of resource constraints, and some target performance characteristics. Using these, Flamingo automatically parameterizes a generic storage stack so that it can operate on the physical rack. It can also efficiently explore the performance impact of varying the rack resources. Flamingo incorporates key principles learned from the design and deployment of cold storage systems. We demonstrate that Flamingo can reduce the time taken to design custom racks to support near-line storage.
