X-Trace: A Pervasive Network Tracing Framework

Modern Internet systems often combine different applications (e.g., DNS, web, and database), span different administrative domains, and function in the context of network mechanisms like tunnels, VPNs, NATs, and overlays. Diagnosing these complex systems is a daunting challenge. Although many diagnostic tools exist, they are typically designed for a specific layer (e.g., traceroute) or application, and there is currently no tool for reconstructing a comprehensive view of service behavior. In this paper we propose X-Trace, a tracing framework that provides such a comprehensive view for systems that adopt it. We have implemented X-Trace in several protocols and software systems, and we discuss how it works in three deployed scenarios: DNS resolution, a three-tiered photo-hosting website, and a service accessed through an overlay network.
