Volley: Automated Data Placement for Geo-Distributed Cloud Services

As cloud services grow to span more and more globally distributed datacenters, there is an increasingly urgent need for automated mechanisms to place application data across these datacenters. This placement must deal with business constraints such as WAN bandwidth costs and datacenter capacity limits, while also minimizing user-perceived latency. The task of placement is further complicated by the issues of shared data, data inter-dependencies, application changes and user mobility. We document these challenges by analyzing month-long traces from Microsoft's Live Messenger and Live Mesh, two large-scale commercial cloud services. We present Volley, a system that addresses these challenges. Cloud services make use of Volley by submitting logs of datacenter requests. Volley analyzes the logs using an iterative optimization algorithm based on data access patterns and client locations, and outputs migration recommendations back to the cloud service. To scale to the data volumes of cloud service logs, Volley is designed to work in SCOPE [34], a scalable MapReduce-style platform; this allows Volley to perform over 400 machine-hours' worth of computation in less than a day. We evaluate Volley on the month-long Live Mesh trace, and we find that, compared to a state-of-the-art heuristic that places data closest to the primary IP address that accesses it, Volley simultaneously reduces datacenter capacity skew by over 2×, reduces inter-datacenter traffic by over 1.8× and reduces 75th-percentile user latency by over 30%.

[1] David P. Williamson, et al. Improved approximation algorithms for capacitated facility location problems, 1999, IPCO.

[2] Alec Wolman, et al. Centrifuge: Integrated Lease Management and Partitioning for Cloud Services, 2010, NSDI.

[3] Ramesh Govindan, et al. Reliable and efficient programming abstractions for wireless sensor networks, 2007, PLDI '07.

[4] Magnus Karlsson, et al. Do We Need Replica Placement Algorithms in Content Delivery Networks, 2002.

[5] Andries van Dam, et al. Experience with distributed processing on a host/satellite graphics system, 1976, SIGGRAPH.

[6] Benjamin Livshits, et al. Doloto: code splitting for network-bound web 2.0 applications, 2008, SIGSOFT '08/FSE-16.

[7] Fan Yang, et al. Hilda: A High-Level Language for Data-Driven Web Applications, 2006, 22nd International Conference on Data Engineering (ICDE'06).

[8] Jacob R. Lorch, et al. Matchmaking for online games and other latency-sensitive P2P systems, 2009, SIGCOMM '09.

[9] Randy H. Katz, et al. X-Trace: A Pervasive Network Tracing Framework, 2007, NSDI.

[10] Rimon Barr, et al. Design and implementation of a single system image operating system for ad hoc networks, 2005, MobiSys '05.

[11] Andrew P. Black, et al. Fine-grained mobility in the Emerald system, 1987, TOCS.

[12] Sudipto Guha, et al. Improved combinatorial algorithms for the facility location and k-median problems, 1999, 40th Annual Symposium on Foundations of Computer Science.

[13] Galen C. Hunt, et al. The Coign automatic distributed partitioning system, 1999, OSDI '99.

[14] Marc Shapiro, et al. SOS: An Object-Oriented Operating System - Assessment and Perspectives, 1989, Comput. Syst.

[15] Jure Leskovec, et al. Planetary-scale views on a large instant-messaging network, 2008, WWW.

[16] Satish Rao, et al. Expander flows, geometric embeddings and graph partitioning, 2004, STOC '04.

[17] Christopher Stewart, et al. Profile-Driven Component Placement for Cluster-Based Online Services, 2004, IEEE Distributed Syst. Online.

[18] Yannis Smaragdakis, et al. J-Orchestra: Automatic Java Application Partitioning, 2002, ECOOP.

[19] Andrew S. Grimshaw, et al. The core Legion object model, 1996, Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing.

[20] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[21] Eric A. Brewer, et al. Pinpoint: problem determination in large, dynamic Internet services, 2002, Proceedings International Conference on Dependable Systems and Networks.

[22] Shigeru Chiba, et al. A Bytecode Translator for Distributed Execution of "Legacy" Java Software, 2001, ECOOP.

[23] James D. Foley, et al. Configurable applications for graphics employing satellites (CAGES), 1975, SIGGRAPH '75.

[24] Andrew S. Tanenbaum, et al. Globe: a wide area distributed system, 1999, IEEE Concurr.

[25] Jason Flinn, et al. Slingshot: deploying stateful services in wireless hotspots, 2005, MobiSys.

[26] Microsoft Office Online, 2010.

[27] Yao Zhao, et al. BotGraph: Large Scale Spamming Botnet Detection, 2009, NSDI.

[28] Sivan Toledo, et al. Wishbone: Profile-based Partitioning for Sensornet Applications, 2009, NSDI.

[29] Jure Leskovec, et al. Worldwide Buzz: Planetary-Scale Views on an Instant-Messaging Network, 2007, WWW 2008.

[30] Gregory R. Ganger, et al. Dynamic Function Placement for Data-Intensive Cluster Computing, 2000, USENIX Annual Technical Conference, General Track.

[31] Robert Tappan Morris, et al. Vivaldi: a decentralized network coordinate system, 2004, SIGCOMM '04.

[32] Mahadev Satyanarayanan, et al. Balancing performance, energy, and quality in pervasive computing, 2002, Proceedings 22nd International Conference on Distributed Computing Systems.

[33] V. T. Rajan, et al. Dynamic Application Partitioning in VisualAge Generator Version 3.0, 1998, ECOOP Workshops.

[34] Jingren Zhou, et al. SCOPE: easy and efficient parallel processing of massive data sets, 2008, Proc. VLDB Endow.

[35] Yuval Shavitt, et al. Constrained mirror placement on the Internet, 2001, Proceedings IEEE INFOCOM 2001. Conference on Computer Communications. Twentieth Annual Joint Conference of the IEEE Computer and Communications Society.