A Distributed Architecture for the Monitoring of Clouds and CDNs: Applications to Amazon AWS

Clouds and CDNs tend to decouple the content that users request from the physical servers that serve it. From a network perspective, monitoring and optimizing the performance of the traffic they generate is challenging, because the same resource can be hosted in multiple locations, and those locations can change at any time. The first step toward understanding cloud and CDN systems is therefore to engineer a monitoring platform. In this paper, we propose a novel solution that combines passive and active measurements, with a workflow tailored specifically to characterizing the traffic generated by cloud and CDN infrastructures. We validate our platform through a longitudinal characterization of the well-known cloud and CDN infrastructure provider Amazon Web Services (AWS). By observing the traffic generated by more than 50 000 Internet users of an Italian Internet Service Provider, we explore the EC2, S3, and CloudFront AWS services, unveiling their infrastructure, the pervasiveness of the web services they host, and their traffic allocation policies as seen from our vantage points. Most importantly, we observe their evolution over a two-year period. The solution presented in this paper can be of interest to: 1) developers building measurement tools for cloud infrastructure providers; 2) developers interested in failure and anomaly detection systems; and 3) third-party service-level agreement certifiers who want to design systems that monitor performance independently. Finally, we believe the AWS results presented in this paper are interesting because they are among the first to unveil properties of AWS as seen from the operator's point of view.
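To make the core difficulty concrete (the same resource being served from locations that change over time), the following is a minimal sketch rather than the platform described in the paper. It assumes passive observations are available as hypothetical `(timestamp, hostname, server_ip)` tuples; the function names, the record layout, and the 60-second window are illustrative assumptions, and the addresses come from documentation-only ranges.

```python
from collections import defaultdict


def servers_per_window(observations, window_seconds):
    """Group the server IPs seen for each hostname by time window.

    `observations` is an iterable of (timestamp, hostname, server_ip)
    tuples; this layout is an assumption made for illustration.
    """
    windows = defaultdict(lambda: defaultdict(set))
    for timestamp, hostname, server_ip in observations:
        window = int(timestamp // window_seconds)
        windows[hostname][window].add(server_ip)
    return windows


def changed_hostnames(windows):
    """Report hostnames whose server set differs between consecutive observed windows."""
    changed = {}
    for hostname, by_window in windows.items():
        ordered = sorted(by_window)
        for prev, cur in zip(ordered, ordered[1:]):
            if by_window[prev] != by_window[cur]:
                changed.setdefault(hostname, []).append((prev, cur))
    return changed


# Synthetic observations (not measured data).
observations = [
    (0, "static.example.com", "192.0.2.10"),
    (30, "static.example.com", "192.0.2.10"),
    (70, "static.example.com", "198.51.100.7"),
    (10, "api.example.com", "203.0.113.5"),
    (80, "api.example.com", "203.0.113.5"),
]

windows = servers_per_window(observations, window_seconds=60)
changes = changed_hostnames(windows)
print(changes)
```

In this synthetic input, only `static.example.com` is served by a different address in the second window, so it is the only hostname reported. A real deployment would also need active measurements to locate and characterize the servers behind each address.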
