A statistical approach to predictive detection

Abstract: Service providers typically define quality of service problems using threshold tests, such as “Are HTTP operations greater than 12 per second on server XYZ?” Herein, we estimate the probability of threshold violations for specific times in the future. We model the threshold metric (e.g., HTTP operations per second) at two levels: (1) non-stationary behavior (as is done in workload forecasting for capacity planning) and (2) stationary, time-serial dependencies. Our approach is assessed using simulation experiments and measurements of a production Web server. For both assessments, the probabilities of threshold violations produced by our approach lie well within two standard deviations of the measured fraction of threshold violations.
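As a rough illustration of the two-level idea described in the abstract, the Python sketch below estimates the probability that a metric exceeds a threshold a few steps into the future. The specific modeling choices are assumptions made for illustration and are not stated in the abstract: per-slot means (e.g., one mean per hour of day) stand in for the non-stationary level, a first-order autoregressive model, AR(1), stands in for the stationary time-serial dependencies, and forecast errors are treated as Gaussian. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist, mean


def fit_seasonal_means(series, period):
    """Level 1 (assumed form): one mean per slot in the period, e.g. hour of day."""
    return [mean(series[slot::period]) for slot in range(period)]


def fit_ar1(residuals):
    """Level 2 (assumed form): least-squares AR(1) fit to the residuals.

    Returns the coefficient phi and the innovation variance sigma2.
    """
    prev, curr = residuals[:-1], residuals[1:]
    phi = sum(a * b for a, b in zip(prev, curr)) / sum(a * a for a in prev)
    innovations = [b - phi * a for a, b in zip(prev, curr)]
    sigma2 = sum(e * e for e in innovations) / len(innovations)
    return phi, sigma2


def violation_probability(series, period, threshold, horizon):
    """Estimate P(metric > threshold) at `horizon` steps past the last sample."""
    means = fit_seasonal_means(series, period)
    # Remove the non-stationary component to obtain residuals.
    residuals = [v - means[t % period] for t, v in enumerate(series)]
    phi, sigma2 = fit_ar1(residuals)

    # h-step-ahead AR(1) forecast of the residual and its error variance.
    predicted_residual = (phi ** horizon) * residuals[-1]
    predicted_variance = sigma2 * sum(phi ** (2 * k) for k in range(horizon))

    # Add back the slot mean for the future time index.
    future_index = len(series) - 1 + horizon
    predicted_mean = means[future_index % period] + predicted_residual

    forecast = NormalDist(predicted_mean, math.sqrt(predicted_variance))
    return 1.0 - forecast.cdf(threshold)
```

Given a list of hourly measurements `ops_per_second` (a hypothetical name), a call such as `violation_probability(ops_per_second, period=24, threshold=12, horizon=3)` would return the estimated probability that the metric exceeds 12 three hours after the last sample, under the assumptions above.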
