A Hierarchical Framework for Modeling and Forecasting Web Server Workload

Proactive management of web server farms requires accurate prediction of workload. A representative measure of workload is the number of service requests per unit time. As a time series, the workload exhibits not only short-term random fluctuations but also prominent periodic (daily) patterns that evolve randomly from one period to the next. A hierarchical framework with multiple time scales is proposed to model such time series. This framework leads to an adaptive procedure that provides both long-term (in days) and short-term (in minutes) predictions with simultaneous confidence bands that accommodate not only serial correlation but also heavy-tailedness, heteroscedasticity, and nonstationarity of the data.
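As a rough illustration of the two-time-scale idea, the following Python sketch tracks a slowly evolving daily profile with an exponentially weighted average and models the within-day deviations from it with a first-order autoregression. This is only a minimal sketch under stated assumptions, not the procedure proposed in the paper: the function names, the smoothing weight `alpha`, and the AR(1) residual model are all hypothetical choices made for illustration, and the sketch does not compute confidence bands.

```python
def update_daily_profile(profile, day, alpha=0.3):
    """Long time scale: exponentially weighted update of the per-slot daily profile.

    `profile` and `day` are equal-length lists with one value per time slot.
    `alpha` is an assumed smoothing weight, not a value taken from the paper.
    """
    return [(1 - alpha) * p + alpha * x for p, x in zip(profile, day)]


def fit_ar1(residuals):
    """Short time scale: least-squares AR(1) coefficient for zero-mean residuals."""
    num = sum(a * b for a, b in zip(residuals[1:], residuals[:-1]))
    den = sum(a * a for a in residuals[:-1])
    return num / den if den > 0 else 0.0


def forecast_short_term(profile, phi, last_residual, slot, steps):
    """Forecast `steps` slots ahead of `slot` (the index of the last observed slot).

    Each forecast is the profile value for the target slot plus the AR(1)
    residual forecast, which decays geometrically with the horizon.
    """
    n = len(profile)
    return [profile[(slot + h) % n] + (phi ** h) * last_residual
            for h in range(1, steps + 1)]


if __name__ == "__main__":
    # Toy data: three "days" with four slots each (hypothetical request counts).
    days = [[10, 40, 60, 20], [12, 44, 58, 22], [11, 42, 63, 19]]
    profile = days[0]
    for day in days[1:]:
        profile = update_daily_profile(profile, day)
    # Residuals of the latest day relative to the current profile.
    residuals = [x - p for x, p in zip(days[-1], profile)]
    phi = fit_ar1(residuals)
    print(forecast_short_term(profile, phi, residuals[-1], slot=3, steps=2))
```

In this sketch the updated profile itself serves as the long-term (next-day) prediction, while `forecast_short_term` corrects it over the next few slots using the most recent deviation.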
