Right-Sizing Geo-distributed Data Centers for Availability and Latency

We show cloud developers how to right-size data center (DC) capacity for geo-distributed applications deployed across several multi-megawatt DCs, possibly supplemented by many smaller edge DCs. Capacity planning for a geo-distributed infrastructure does not decompose into independent capacity planning for each DC. When edge DCs are used, heterogeneous availability and costs affect the capacity split between edge and core DCs. The non-uniform spatial distribution of clients and the interdependence between latency and availability constraints make it non-trivial to provision the right capacity at each DC. We develop a geo-distributed capacity planning framework that captures the key factors influencing capacity: application demand patterns, latency and availability requirements, DC cost-availability trade-offs, and data replication overheads. We apply the framework to a realistic application and DC infrastructure setting to gather insights into how capacity should be provisioned and allocated across DCs for a representative set of requirements and costs.
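
To illustrate why per-DC planning does not decompose, consider a minimal sketch that is not the framework described above. It assumes identical DCs, a single global peak demand, and demand that can be redirected freely to any surviving DC (so latency constraints are ignored). The function name `total_capacity` and its parameters are hypothetical, introduced only for this illustration.

```python
def total_capacity(peak_demand: float, num_dcs: int, failures_tolerated: int = 1) -> float:
    """Total capacity needed so that peak demand still fits after DC failures.

    Illustrative assumptions (not from the source): all DCs are identical,
    and any surviving DC can serve any client.
    """
    survivors = num_dcs - failures_tolerated
    if survivors <= 0:
        raise ValueError("at least one DC must survive the tolerated failures")
    # Each DC must carry an equal share of the peak among the survivors.
    per_dc = peak_demand / survivors
    # Every DC is provisioned at that level, including those that may fail.
    return per_dc * num_dcs


if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        print(n, total_capacity(100.0, n))
```

Under these assumptions the capacity each DC needs depends on how many other DCs exist, so it cannot be chosen in isolation: with two DCs the total is twice the peak demand, while with five DCs it falls to 1.25 times the peak. Latency constraints, which this sketch omits, would limit which DCs can absorb redirected demand.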
