On Measurement and Analysis of Internet Backbone Traffic

Over the last decade, the Internet has become a key component of commercial and personal communication. Its success rests mainly on its versatility and flexibility, which have allowed the development of network applications ranging from simple text-based utilities to complex systems for e-commerce and multimedia content. The ongoing expansion of the Internet causes continuous changes in utilization and traffic behavior. Because of this diversity and its fast-changing properties, the Internet is a moving target, and at present it is far from being well understood in its entirety. Because Internet characteristics change constantly with both time and location, it is imperative for the Internet community to understand the nature and behavior of current Internet traffic in order to support research and further development. Measurement and analysis of traffic are a means to improve that understanding. This thesis presents a successful Internet measurement project and provides guidelines for conducting passive network measurements. Recent large-scale backbone traffic data is analyzed, revealing the current deployment of protocol features at the packet and flow level, including statistics about anomalies and misbehavior. A method is proposed to classify packet header data at the transport level according to network application, resulting in a complete traffic decomposition. The signaling behavior of the main traffic classes (Web, P2P, and malicious traffic) is compared. The results are significant because of the overall impact of these traffic classes on Internet traffic behavior. The scale of the measurements makes it possible to highlight longitudinal trends and changes in network application and protocol usage. Such findings support proactive steps such as the refinement of network design, provisioning, accounting, and security measures.
Finally, the analysis of data taken on vital Internet backbone links also provides valuable input for simulation models. By presenting a snapshot of current traffic composition and characteristics, this thesis contributes to a better understanding of how the Internet functions.
