Linear-time Detection of Non-linear Changes in Massively High Dimensional Time Series

Change detection in multivariate time series has applications in many domains, including health care and network monitoring. A common approach to detecting changes is to measure the divergence between the distribution of a reference window and that of a test window. When the number of dimensions is very large, however, the naive approach has both quality and efficiency issues: to ensure robustness, the window size needs to be large, which not only leads to missed alarms but also increases runtime. To address this, we propose LIGHT, a linear-time algorithm for robustly detecting non-linear changes in massively high-dimensional time series. Importantly, LIGHT provides high flexibility in choosing the window size, allowing the domain expert to fit the level of detail required. To do so, we 1) perform scalable PCA to reduce dimensionality, 2) perform scalable factorization of the joint distribution, and 3) scalably compute divergences between the resulting lower-dimensional distributions. Extensive empirical evaluation on both synthetic and real-world data shows that LIGHT outperforms the state of the art, with up to 100% improvement in both quality and efficiency.
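The three steps listed above can be illustrated with a small sketch. This is not the LIGHT algorithm: it assumes an exact SVD in place of scalable PCA, treats the principal components as independent (the coarsest possible factorization of the joint distribution), and uses a histogram-based Jensen-Shannon divergence as a stand-in for the divergence measure. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def pca_basis(window, k):
    """Return the mean and top-k principal directions of an (n, d) window."""
    mean = window.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(window - mean, full_matrices=False)
    return mean, vt[:k].T


def hist_js_divergence(a, b, bins=20):
    """Jensen-Shannon divergence (in bits) of two 1-D samples, using shared bins."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    if hi == lo:
        return 0.0  # both samples are the same constant
    p, _ = np.histogram(a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(b, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(x, y):
        mask = x > 0  # y = m is positive wherever x is positive
        return float(np.sum(x[mask] * np.log2(x[mask] / y[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def change_score(reference, test, k=2, bins=20):
    """Sum of per-component divergences after projecting onto the reference PCA basis.

    Assumption: components are treated as independent, so the joint divergence
    is approximated by a sum of one-dimensional divergences.
    """
    mean, basis = pca_basis(reference, k)
    ref_proj = (reference - mean) @ basis
    test_proj = (test - mean) @ basis
    return sum(
        hist_js_divergence(ref_proj[:, j], test_proj[:, j], bins) for j in range(k)
    )


# Synthetic example (hypothetical data): 50 observed dimensions driven by 2 latent factors.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(2, 50))


def sample(n, latent_shift):
    latent = rng.normal(size=(n, 2)) + latent_shift
    return latent @ mixing + 0.1 * rng.normal(size=(n, 50))


reference = sample(500, np.array([0.0, 0.0]))
test_same = sample(500, np.array([0.0, 0.0]))      # no change
test_changed = sample(500, np.array([4.0, 0.0]))   # shift in one latent factor

score_same = change_score(reference, test_same)
score_changed = change_score(reference, test_changed)
```

In this sketch a change would be flagged when the score exceeds a threshold. How that threshold is chosen is left open here, and the abstract does not specify it.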
