Network monitoring and diagnosis based on available bandwidth measurement

Network monitoring and diagnosis systems are used by ISPs for daily network management operations and by popular network applications, such as peer-to-peer systems, for performance optimization. However, the high overhead of some monitoring and diagnostic techniques can limit their applicability. This is the case, for example, for end-to-end available bandwidth estimation: tools previously developed for available bandwidth monitoring and diagnosis often have high overhead and are difficult to use. This dissertation puts forth the claim that end-to-end available bandwidth and bandwidth bottlenecks can be efficiently and effectively estimated using packet-train probing techniques. By using source and sink tree structures that can capture network edge information, and with the support of a properly designed measurement infrastructure, bandwidth-related measurements can also be scalable and convenient enough to be used routinely by both ISPs and regular end users. These claims are supported by four techniques presented in this dissertation: the IGI/PTR end-to-end available bandwidth measurement technique, the Pathneck bottleneck locating technique, the BRoute large-scale available bandwidth inference system, and the TAMI monitoring and diagnostic infrastructure. The IGI/PTR technique implements two available-bandwidth measurement algorithms, which estimate the background traffic load (IGI) and the packet transmission rate (PTR), respectively. It demonstrates that end-to-end available bandwidth can be measured both accurately and efficiently, thus solving the path-level available-bandwidth monitoring problem. The Pathneck technique uses a carefully constructed packet train to locate bottleneck links, making it easier to diagnose available-bandwidth-related problems. Pathneck needs only single-end control and is extremely lightweight. These properties make it attractive for both regular network users and ISP network operators.
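To make the notion of a packet transmission rate concrete, the following is a minimal sketch and not the IGI/PTR implementation. It assumes the receiver records an arrival timestamp (in microseconds) for each probe packet of a train, and it computes the rate as the bits delivered after the first packet divided by the arrival-time span. The function name and the sample values are hypothetical.

```python
def packet_train_rate_bps(arrival_times_us, packet_size_bytes):
    """Rate, in bits per second, at which a probing train arrived.

    arrival_times_us: receiver timestamps in microseconds, one per packet,
                      in arrival order (hypothetical input format).
    packet_size_bytes: size of each probe packet in the train.
    """
    if len(arrival_times_us) < 2:
        raise ValueError("need at least two packets to measure a rate")
    span_us = arrival_times_us[-1] - arrival_times_us[0]
    if span_us <= 0:
        raise ValueError("arrival timestamps must be increasing")
    # The first packet only marks the start of the interval, so N packets
    # contribute N - 1 packet sizes to the measured span.
    bits = (len(arrival_times_us) - 1) * packet_size_bytes * 8
    return bits * 1_000_000 / span_us


# Hypothetical train: five 500-byte packets arriving 1 ms apart.
example_rate = packet_train_rate_bps([0, 1000, 2000, 3000, 4000], 500)
print(example_rate)  # 4000000.0
```

This covers only the rate computation; how the probing gap is chosen and when the measured rate is taken as the available-bandwidth estimate is outside the scope of this sketch.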
The BRoute system uses a novel concept, source and sink trees, to capture end-user routing structures and network-edge bandwidth information. Equipped with path-edge inference algorithms, BRoute can infer the available bandwidth of all N^2 paths in an N-node system with only O(N) measurement overhead. That is, BRoute solves the system-level available-bandwidth monitoring problem. The TAMI measurement infrastructure introduces measurement scheduling and topology-aware capabilities to systematically support all the monitoring and diagnostic techniques proposed in this dissertation. TAMI not only supports network monitoring and diagnosis but can also effectively improve the performance of network applications such as peer-to-peer systems.
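The scaling argument can be illustrated with a deliberately simplified sketch; it is not BRoute's path-edge inference algorithm. It assumes that each node contributes just two measurements (the available bandwidth of its outgoing and of its incoming edge segment) and that every path's bottleneck lies on one of its two edge segments. All names and values are hypothetical.

```python
from itertools import permutations


def infer_all_paths(source_edge_bw, sink_edge_bw):
    """Estimate available bandwidth for every ordered node pair.

    source_edge_bw[n]: measured available bandwidth (Mbps) leaving node n.
    sink_edge_bw[n]:   measured available bandwidth (Mbps) entering node n.
    Assumption: each path is limited by the source's outgoing edge segment
    or the destination's incoming edge segment, whichever is smaller.
    """
    nodes = sorted(source_edge_bw)
    return {
        (src, dst): min(source_edge_bw[src], sink_edge_bw[dst])
        for src, dst in permutations(nodes, 2)
    }


# Hypothetical 3-node system: 2 * 3 = 6 measurements in total.
source_edge_bw = {"a": 10, "b": 50, "c": 30}
sink_edge_bw = {"a": 20, "b": 40, "c": 5}
estimates = infer_all_paths(source_edge_bw, sink_edge_bw)
print(len(estimates))         # 6 ordered paths
print(estimates[("a", "c")])  # 5
```

The number of measurements grows linearly with the number of nodes, while the number of estimated paths grows quadratically; this is the property the abstract refers to, under the edge-bottleneck assumption stated above.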
