Take me to your leader! Online Optimization of Distributed Storage Configurations

The configuration of a distributed storage system typically includes, among other parameters, the set of servers and their roles in the replication protocol. Although mechanisms for changing the configuration at runtime exist, it is usually left to system administrators to determine the "best" configuration manually and to reconfigure the system periodically, often by trial and error. This paper describes a new workload-driven optimization framework that dynamically determines the optimal configuration at runtime. We focus on optimizing leader- and quorum-based replication schemes and divide the framework into three optimization tiers, each dynamically optimizing a different aspect of the configuration: 1) leader placement, 2) the roles of the servers in the replication protocol, and 3) replica locations. We showcase the framework by applying it to a large-scale distributed storage system used internally at Google and demonstrate that most client applications benefit significantly from it, with average operation latency reduced by up to 94%.
