Internet traffic characterization

Traffic statistics normally collected during day-to-day operation of wide-area datagram networks are frequently insufficient for researchers studying the workloads and performance of these real-world environments. As wide-area networks become more widespread and service expectations rise, current methods for collecting data will become even less suitable. We examine ways to improve techniques for statistics collection so that the resulting data will enable researchers, and indeed service providers themselves, to develop more accurate Internet traffic models. We first provide a taxonomy of traffic characterization tasks. We then use operationally collected statistics to characterize traffic of the T1 and T3 NSFNET backbones. Because current infrastructure statistics collection is oriented toward either short-term operational requirements or periodic, simplistic traffic reports to funding agencies, the resulting data are often not conducive to assessing network workload or performance; we evaluate to what extent they are useful for tasks in the taxonomy, and propose improvements in current statistics collection architectures, with particular application to the NSFNET backbone. We also investigate the effects of using sampling to characterize traffic and evaluate performance in a high-speed wide-area network environment. In the second part of the thesis we focus on items in the outlined taxonomy that are not conducive to investigation using operationally collected statistics. These items mostly involve short-term aspects of Internet flows, which operationally collected statistics fail to expose. We develop a general methodology for use in assessing Internet flow profiles and their impact on an aggregate Internet workload. Our methodology for profiling flows differs from that of many previous studies, which have concentrated on end-point definitions of flows, delimiting them by TCP connections using the TCP SYN and FIN control mechanism.
We focus on the IP layer and define flows based on traffic satisfying various temporal and spatial locality conditions, as observed at internal points of the network. We first define the parameter space and then concentrate on metrics characterizing both individual flows and the aggregate flow. Metrics of individual flows include volume in packets and bytes per flow, and flow duration. Metrics of the aggregate flow, or workload characteristics from the network perspective, include counts of the number of active, new, and timed-out flows per time interval; flow interarrival and arrival processes; and flow locality metrics. Applying the methodology to our measurements yields significant observations about the Internet infrastructure, which have implications for performance requirements of routers at Internet hotspots, general and specialized flow-based routing algorithms, future usage-based accounting requirements, and traffic prioritization. Finally, we discuss trends that will affect how Internet service providers collect statistics in the future. Improvements in operational statistics collection, such as support for flow assessment, will help networking activities along various time horizons, from defining service quality patterns to long-term capacity planning. We offer a unique combination of operational and research perspectives, allowing us to reduce the gaps among (1) what network service providers need; (2) what statistics service providers can provide; and (3) what network analysis requires.
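
To make the timeout-based notion of a flow more concrete, the following is a minimal sketch of how the per-flow and aggregate metrics named above might be computed from a packet trace. It is illustrative only and is not taken from the thesis. The flow key (a source/destination address pair), the 64-second idle timeout, the packet tuple layout, and the names `Flow`, `profile_flows`, and `interval_counts` are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical packet record: (timestamp_s, src, dst, size_bytes),
# assumed to arrive in non-decreasing timestamp order.
Packet = Tuple[float, str, str, int]
FlowKey = Tuple[str, str]


@dataclass
class Flow:
    key: FlowKey
    first_seen: float
    last_seen: float
    packet_count: int = 0
    byte_count: int = 0

    @property
    def duration(self) -> float:
        # Time between the first and last packet of the flow.
        return self.last_seen - self.first_seen


def profile_flows(packets: Iterable[Packet], timeout: float = 64.0) -> List[Flow]:
    """Group packets into flows; an idle gap longer than `timeout` ends a flow.

    The (src, dst) key and the default timeout are assumptions of this sketch.
    """
    active: Dict[FlowKey, Flow] = {}
    finished: List[Flow] = []
    for ts, src, dst, size in packets:
        key = (src, dst)
        flow = active.get(key)
        if flow is not None and ts - flow.last_seen > timeout:
            finished.append(flow)  # previous flow for this key has timed out
            flow = None
        if flow is None:
            flow = Flow(key, first_seen=ts, last_seen=ts)
            active[key] = flow
        flow.last_seen = ts
        flow.packet_count += 1
        flow.byte_count += size
    finished.extend(active.values())  # flush flows still open at end of trace
    return finished


def interval_counts(flows: List[Flow], timeout: float, interval: float,
                    horizon: float) -> List[Tuple[float, int, int, int]]:
    """Return (interval_start, new, active, timed_out) for each interval.

    A flow is treated as active from its first packet until `timeout`
    seconds after its last packet (an assumption of this sketch).
    """
    rows = []
    start = 0.0
    while start < horizon:
        end = start + interval
        new = sum(start <= f.first_seen < end for f in flows)
        active = sum(f.first_seen < end and f.last_seen + timeout >= start
                     for f in flows)
        timed_out = sum(start <= f.last_seen + timeout < end for f in flows)
        rows.append((start, new, active, timed_out))
        start = end
    return rows
```

Because flows are keyed and expired purely on what is observed at the measurement point, the same code applies to any IP traffic, not only to TCP connections bracketed by SYN and FIN. Varying `timeout` and the key granularity is one way to explore the parameter space the methodology describes.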
