Fast and Cautious Evolution of Cloud Storage

When changing a storage system, the stakes are high. Any modification can undermine stability, causing temporary downtime, permanent loss of data, or, worse still, a loss of user confidence. The result is a cautious conservatism among storage developers. On one hand, the risks do justify taking great care with storage system changes. On the other hand, this slow and cautious approach to deployment is a poor match for cloud services that are tied closely to web-based frontends following an "always beta" mantra. Unlike traditional enterprise servers, cloud-based systems are still exploring what facilities the storage layer should provide, which requires that storage services be able to evolve as quickly as the applications that consume them. In this paper, we argue that by building support for evolution into the basic structure of a storage system, new features (and fixes) can be deployed in a manner that is both fast and cautious. We summarize our experience in developing such a system and detail its requirements and design. We also share some initial experience in deploying it on a rapidly evolving, yet production, cloud hosting service that we have been building at UBC.
