Real-time End-to-end Network Monitoring in Large Distributed Systems

Measuring real-time end-to-end network path performance metrics is important for several distributed applications, such as media streaming systems (e.g., for switching to paths with higher bandwidth and lower jitter) and content distribution systems (e.g., for selecting servers with lower latency). However, it is challenging to perform such end-to-end pairwise measurements in large distributed systems while achieving high accuracy and avoiding interference with existing traffic. On the end hosts, the measurements can overload the machine by interfering with one another and with other processes. In the network, measurement packets from different hosts can interfere with one another and with other flows on bottleneck links. In this paper, we propose a system that monitors end-host and network resources and adapts the number of measurements to the observed load. Our scheme avoids interference by measuring only a small subset of network paths and reconstructing the properties of all network paths from these partial, indirect measurements. Our simulation experiments and real testbed experiments on PlanetLab show that our path selection algorithm, operating under resource constraints, does not adversely affect the accuracy of inference, and that our system can effectively adapt to changing resource usage at the end hosts.
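The idea of measuring a subset of paths and reconstructing the rest can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes an additive metric such as latency, a known 0/1 routing matrix (paths by links), and a fixed measurement budget standing in for the resource constraint. The function name `select_and_infer`, the toy topology, and the link latencies are all hypothetical. A path is measured directly only if its row is linearly independent of the rows already measured and budget remains; otherwise its value is inferred as a linear combination of earlier measurements when one exists.

```python
from fractions import Fraction

def select_and_infer(routing, measure, budget):
    """Greedy sketch (illustrative assumption, not the paper's method).

    routing: list of 0/1 rows, one per path, one column per link.
    measure: callable(path_index) -> measured additive metric.
    budget:  maximum number of direct measurements allowed.
    Returns (measured, inferred); inferred[p] is None when p can be
    neither measured (budget exhausted) nor reconstructed.
    """
    basis = []      # (pivot column, reduced row, combination of measured paths)
    measured = {}   # path index -> directly measured value
    inferred = {}   # path index -> reconstructed value or None
    for p, row in enumerate(routing):
        vec = [Fraction(x) for x in row]
        coef = {}   # invariant: vec == row - sum(coef[k] * routing[k])
        for pivot, brow, bcombo in basis:
            c = vec[pivot]
            if c != 0:
                vec = [v - c * b for v, b in zip(vec, brow)]
                for k, w in bcombo.items():
                    coef[k] = coef.get(k, Fraction(0)) + c * w
        if all(v == 0 for v in vec):
            # Row lies in the span of measured paths: infer its value.
            inferred[p] = sum(w * measured[k] for k, w in coef.items())
        elif len(measured) < budget:
            # Independent row and budget left: measure it directly.
            measured[p] = measure(p)
            pivot = next(i for i, v in enumerate(vec) if v != 0)
            scale = vec[pivot]
            brow = [v / scale for v in vec]
            bcombo = {k: -w / scale for k, w in coef.items()}
            bcombo[p] = Fraction(1) / scale
            basis.append((pivot, brow, bcombo))
        else:
            inferred[p] = None  # independent, but no budget left
    return measured, inferred

# Hypothetical toy topology: 3 links, 5 paths.
LINK_LATENCY = [10, 20, 5]
ROUTING = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]

def probe(p):
    """Stand-in for a real probe: sums the latencies of the path's links."""
    return sum(l for l, used in zip(LINK_LATENCY, ROUTING[p]) if used)

measured, inferred = select_and_infer(ROUTING, probe, budget=3)
```

With a budget of three, the first three paths are probed and the remaining two are reconstructed without any additional traffic. With a smaller budget, paths that are independent of the measured set are reported as `None` rather than guessed.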
