Monitoring High-Dimensional Data for Failure Detection and Localization in Large-Scale Computing Systems

Processing high-dimensional measurements for failure detection and localization is a major challenge in large-scale computing systems. In information systems, however, such measurements typically lie on a low-dimensional structure embedded in the high-dimensional space. Building on this observation, we propose an approach that models the geometry of the underlying data generation process and detects anomalies relative to that model. We consider both linear and nonlinear data generation models. Two statistics, the Hotelling T² and the squared prediction error (SPE), capture data variation within the model and outside it, respectively. We track the probability density of these statistics to monitor the system's health. Once a failure has been detected, a localization procedure identifies the attributes most likely to be related to it. Experimental results on both synthetic data and a real e-commerce application demonstrate the effectiveness of our approach in detecting and localizing failures in computing systems.
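To make the two statistics concrete, here is a minimal sketch for the linear case only. It assumes the linear model is fitted by principal component analysis and that T² and SPE take their usual textbook forms; the abstract does not spell out these details. The function names (`fit_linear_model`, `t2_and_spe`), the synthetic data, and the residual-based ranking of attributes are illustrative assumptions, not the authors' procedure, and the density tracking step is not shown.

```python
import numpy as np

def fit_linear_model(X, k):
    """Fit a k-dimensional PCA subspace (assumed stand-in for the linear model)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, sorted by singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                           # (d, k) orthonormal basis of the subspace
    var = s[:k] ** 2 / (X.shape[0] - 1)    # variance along each retained direction
    return mean, P, var

def t2_and_spe(x, mean, P, var):
    """Return Hotelling T^2, SPE, and the residual vector for one measurement."""
    xc = x - mean
    scores = xc @ P                        # coordinates inside the subspace
    t2 = float(np.sum(scores ** 2 / var))  # variance-normalized variation within the model
    residual = xc - scores @ P.T           # part of x the model cannot explain
    spe = float(residual @ residual)       # squared prediction error
    return t2, spe, residual

# Hypothetical synthetic data: 10 attributes driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 10))

mean, P, var = fit_linear_model(X, k=2)

normal = X[0]
faulty = X[0].copy()
faulty[7] += 5.0                           # inject a fault into one attribute

t2_normal, spe_normal, _ = t2_and_spe(normal, mean, P, var)
t2_faulty, spe_faulty, residual = t2_and_spe(faulty, mean, P, var)

# Assumed localization heuristic: rank attributes by squared residual contribution.
ranking = np.argsort(residual ** 2)[::-1]
print("SPE normal:", spe_normal, "SPE faulty:", spe_faulty)
print("attributes ranked by suspicion:", ranking)
```

A measurement that stays close to the fitted subspace yields a small SPE, while the injected fault pushes the sample off the subspace and inflates it. T², by contrast, responds to unusually large variation within the subspace. In a monitoring setting, both values would be compared against a model of their normal behavior rather than inspected by eye.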
