Real-time network data analysis using time series models

Abstract: As computer networks expand, monitoring their properties becomes essential for diagnosing problems and managing the networks effectively. Such monitoring is most useful when performed in real time; however, with a passive monitoring scheme this is difficult, if not impossible, to implement in high-traffic networks. One way to overcome this problem is to sample network data selectively, which in turn raises new questions, such as how frequently the sampling must be performed to obtain useful data. This work shows that high-speed network traffic can be represented accurately with suitable time series models, which are then used to determine the size of the sampling window needed to detect packet loss. The resulting scheme is scalable, protocol-independent, and able to raise alerts in real time.
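As a rough illustration of the general idea (fit a time series model to traffic, then flag samples that fall well below the model's forecast), the following minimal sketch fits a first-order autoregressive model to per-interval packet counts. Everything in it is an assumption made for illustration: the abstract does not state which model order, fitting procedure, or alert threshold the work uses, and the names `fit_ar1`, `loss_alerts`, and the threshold `k` are hypothetical.

```python
from statistics import mean, pstdev


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t].

    Returns (c, phi, sigma), where sigma is the standard deviation
    of the in-sample one-step residuals. AR(1) is an illustrative
    choice, not the model reported in the paper.
    """
    x_prev = series[:-1]
    x_next = series[1:]
    m_prev, m_next = mean(x_prev), mean(x_next)
    var = sum((a - m_prev) ** 2 for a in x_prev)
    if var == 0:
        # Constant history: fall back to predicting the mean.
        return m_next, 0.0, 0.0
    cov = sum((a - m_prev) * (b - m_next) for a, b in zip(x_prev, x_next))
    phi = cov / var
    c = m_next - phi * m_prev
    residuals = [b - (c + phi * a) for a, b in zip(x_prev, x_next)]
    return c, phi, pstdev(residuals)


def loss_alerts(history, window, k=3.0):
    """Return indices in `window` whose packet count falls more than
    k * sigma below the one-step forecast (a possible loss event).

    `k` is a hypothetical threshold, not a value from the paper.
    """
    c, phi, sigma = fit_ar1(history)
    alerts = []
    prev = history[-1]
    for i, observed in enumerate(window):
        predicted = c + phi * prev
        if predicted - observed > k * sigma:
            alerts.append(i)
            # Carry the forecast forward so one anomalous sample
            # does not distort the next prediction.
            prev = predicted
        else:
            prev = observed
    return alerts


history = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98, 101, 99]
print(loss_alerts(history, [100, 101, 60, 99]))
```

Here `history` stands in for packet counts observed during a loss-free training period, and `window` for the counts seen in one sampling window. A production scheme would presumably select the model order with an information criterion and re-fit as traffic patterns drift; neither is shown here.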
