Structural analysis of network traffic flows

Network traffic arises from the superposition of Origin-Destination (OD) flows. Hence, a thorough understanding of OD flows is essential for modeling network traffic and for addressing a wide variety of problems, including traffic engineering, traffic matrix estimation, capacity planning, forecasting, and anomaly detection. However, to date, OD flows have not been closely studied, and very little is known about their properties.

We present the first analysis of complete sets of OD flow time-series, taken from two different backbone networks (Abilene and Sprint-Europe). Using Principal Component Analysis (PCA), we find that the set of OD flows has small intrinsic dimension. In fact, even in a network with over a hundred OD flows, these flows can be accurately modeled in time using a small number (10 or fewer) of independent components or dimensions.

We also show how to use PCA to systematically decompose the structure of OD flow time-series into three main constituents: common periodic trends, short-lived bursts, and noise. We provide insight into how the various constituents contribute to the overall structure of OD flows and explore the extent to which this decomposition varies over time.
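
As an illustration of what "small intrinsic dimension" means here, the following minimal sketch applies PCA (via a singular value decomposition) to a matrix whose rows are time bins and whose columns are OD flows, and counts how many principal components are needed to capture most of the variance. Everything in it is an assumption made for illustration: the data are synthetic (three shared periodic trends plus noise, not the Abilene or Sprint-Europe measurements), the function names `intrinsic_dimension` and `synthetic_od_flows` are hypothetical, and the 95% variance threshold is an arbitrary choice rather than the criterion used in the analysis.

```python
import numpy as np


def intrinsic_dimension(X, energy=0.95):
    """Number of principal components needed to capture `energy` of the
    total variance of X (rows = time bins, columns = OD flows)."""
    Xc = X - X.mean(axis=0)                      # center each flow's time-series
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    var = s ** 2                                 # proportional to per-component variance
    cum = np.cumsum(var) / var.sum()             # cumulative fraction of variance
    return int(np.searchsorted(cum, energy) + 1)


def synthetic_od_flows(n_bins=1008, n_flows=120, seed=0):
    """Hypothetical OD flow matrix: every flow is a weighted mix of three
    shared periodic trends plus a small amount of independent noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_bins)
    periods = (144, 72, 1008)                    # assumed cycle lengths, in time bins
    trends = np.stack([np.sin(2 * np.pi * t / p) for p in periods], axis=1)
    weights = rng.uniform(0.5, 2.0, size=(len(periods), n_flows))
    noise = 0.05 * rng.standard_normal((n_bins, n_flows))
    return 10.0 + trends @ weights + noise


if __name__ == "__main__":
    X = synthetic_od_flows()
    k = intrinsic_dimension(X)
    print(f"{X.shape[1]} flows, {k} component(s) capture 95% of the variance")
```

Because every synthetic flow is built from the same three trends, a handful of components suffices by construction; the result reported above is that measured backbone OD flows behave similarly.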
