Inter-data-center network traffic prediction with elephant flows

With the ever-increasing number of large-scale Internet applications, inter-data-center (inter-DC) data transfers are becoming increasingly common. Traditional inter-DC transfers suffer from both low utilization and congestion, and traffic prediction is an important tool for optimizing these transfers. Inter-DC traffic is harder to predict than many other types of network traffic because it is dominated by a few large applications. We propose a model that significantly reduces prediction errors. Our model combines the wavelet transform with an artificial neural network (ANN) to improve prediction accuracy. Specifically, we explicitly add information about elephant flows, the least predictable yet dominant traffic in inter-DC networks, to the prediction model. To reduce the monitoring overhead of collecting elephant-flow information, we use interpolation to fill in the unknown values in the elephant flows. We demonstrate that our model reduces prediction errors by 5% to 10% compared with existing methods. Our prediction model is already in production at Baidu, one of the largest Internet companies in China, where it helps reduce peak network bandwidth.
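The two preprocessing ideas named in the abstract, wavelet decomposition of the traffic series and interpolation of sparsely sampled elephant-flow values, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which wavelet or interpolation scheme is used, so the single-level Haar wavelet and the linear interpolation below are assumptions, the function names `haar_dwt`, `haar_idwt`, and `fill_elephant_samples` are hypothetical, and the ANN itself is omitted.

```python
import math


def haar_dwt(x):
    """One level of the Haar wavelet transform (assumed wavelet choice).

    Returns (approximation, detail) coefficient lists, each half as long as x.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, rebuilding the original series from its coefficients."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def fill_elephant_samples(sample_times, sample_values, n):
    """Linearly interpolate sparse elephant-flow samples onto steps 0..n-1.

    sample_times must be strictly increasing integers. Steps before the first
    or after the last sample reuse the nearest sampled value.
    """
    filled = []
    j = 0
    for t in range(n):
        if t <= sample_times[0]:
            filled.append(sample_values[0])
            continue
        if t >= sample_times[-1]:
            filled.append(sample_values[-1])
            continue
        # Advance to the sample interval [t0, t1] that contains t.
        while sample_times[j + 1] < t:
            j += 1
        t0, t1 = sample_times[j], sample_times[j + 1]
        v0, v1 = sample_values[j], sample_values[j + 1]
        filled.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return filled


# Illustrative (made-up) numbers: total traffic per step, and an elephant
# flow that was measured only at steps 0, 3 and 5.
total = [40.0, 42.0, 90.0, 88.0, 41.0, 39.0]
approx, detail = haar_dwt(total)
elephant = fill_elephant_samples([0, 3, 5], [5.0, 50.0, 4.0], len(total))

# Each row pairs the observed traffic with the interpolated elephant-flow
# value; rows like these could serve as inputs to a predictor.
features = list(zip(total, elephant))
```

In this sketch the approximation coefficients carry the smooth trend and the detail coefficients carry the short-term fluctuations, while the interpolated elephant-flow series supplies a value at every time step even though only a few were measured. A spline could be substituted for the linear interpolation without changing the structure.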
