Cloud Federation and Geo-Distribution

The cloud computing paradigm has evolved well beyond its simple early application scenarios, such as third-party hosting of web servers. This evolution was driven by the desire of cloud providers to serve the diverse needs of customers around the globe. In particular, the term “cloud” was originally equated with “datacenter”, yet data and compute clouds have evolved into complex multi-datacenter infrastructures (see Figure 1). As many cloud-based solutions start to serve customers around the globe or through new media, data may be spread across multiple sites, and even across multiple cloud offerings, for reasons including low-latency retrieval based on geographical proximity, legal constraints, or cost considerations. Regardless of the original motivation, federation across datacenters, and in particular so-called “geo-distribution”, raises many challenges around (1) locating and accessing data stored in and shared between datacenters, (2) computing on such distributed data, and, more generally, (3) communicating data across datacenters in the context of (1) and (2). This article first motivates federation, then describes the challenges in these three areas and outlines solutions to them.