An adjusted ARIMA model for internet traffic

Traditional time series models such as ARIMA have been shown to be inadequate for modelling traffic that exhibits long-range dependence. In this paper we present a new model, the adjusted ARIMA (AARIMA) model, for modelling long-range dependent Internet traffic. The AARIMA model is intended as a quick and simple way to model Internet traffic: it retains all the properties of ARIMA models while capturing self-similarity. We use the Box-Jenkins methodology as the framework for our modelling procedure. We construct the model by building the best possible ARIMA model for a trace and then adding our adjustment to obtain the equivalent AARIMA model. The adjustment we propose introduces a feedback term made up of the first difference of the series being modelled. We show that the AARIMA model clearly improves on the ARIMA model under several goodness-of-fit criteria. Our main criterion is the ability to capture the Hurst parameter (H) of the original trace: the model should not underestimate H, and any overestimation should be at most 20% of the H of the measured trace. We used four Hurst parameter estimators to measure the self-similarity of the measured traces and of the corresponding AARIMA and ARIMA models. The AARIMA model was found to capture the long-range dependence irrespective of the estimator used. The adjusted ARIMA model is also shown to accurately predict Internet traffic up to one hour in advance. We used it to predict three public domain Internet traffic traces: a Bellcore Internet wide area network external traffic trace (length 35 hours), a Bellcore Internet wide area network "purple cable" trace (length half an hour), and an MPEG-1 compressed video traffic trace (length half an hour). We show that for the public domain traces the AARIMA model gives values of H that are more accurate than those given by the ARIMA model.
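
The Hurst-parameter criterion stated in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the four estimators used, so the aggregated-variance method below is only an assumed, commonly used stand-in. The function names, block sizes, and minimum block count are hypothetical choices made for illustration, not the authors' procedure.

```python
import numpy as np

def hurst_aggregated_variance(x, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate H with the aggregated-variance method (assumed estimator).

    For a self-similar series, the variance of block means of size m
    scales like m**(2H - 2), so H = 1 + slope / 2 on a log-log plot.
    """
    x = np.asarray(x, dtype=float)
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 8:  # hypothetical cutoff: too few blocks for a stable variance
            continue
        means = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_var.append(np.log(means.var()))
    slope, _ = np.polyfit(log_m, log_var, 1)
    return 1.0 + slope / 2.0

def meets_hurst_criterion(h_trace, h_model, max_over=0.20):
    """Check the abstract's criterion: the model must not underestimate H,
    and may overestimate it by at most 20% of the trace's H."""
    return h_trace <= h_model <= h_trace * (1.0 + max_over)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(20000)  # uncorrelated noise, so H should be near 0.5
    print("H estimate for white noise:", hurst_aggregated_variance(noise))
    print("criterion met:", meets_hurst_criterion(0.80, 0.90))
```

In use, one would estimate H for the measured trace and for a series generated by the fitted model, then pass both values to `meets_hurst_criterion`.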
