Discovering Multi-type Correlated Events with Time Series for Exception Detection of Complex Systems

As systems grow more complex, exception detection becomes both more important and more difficult. For most complex systems, such as cloud platforms, exception detection is conducted mainly by analyzing the large volume of telemetry data collected from the system at runtime. Time series data and event data are the two major types of telemetry data. Correlation analysis techniques are widely used by engineers for data-driven exception detection. Despite their importance, little previous work has addressed correlations between these two heterogeneous data types, continuous time series data and temporal event data, for exception detection. In this paper, we propose an approach to discover the correlations between multi-type time series data and multi-type event data, and we use these correlations to detect system exceptions. Experimental results on real data sets demonstrate the effectiveness of our method for exception detection.
