Branch and Border: Partition-Based Change Detection in Multivariate Time Series

Given multivariate time series data, how do we detect changes in its behavior, such as the onset of illnesses or complications in patients? Can we do this without making strong assumptions about the data? We propose BnB (Branch and Border), an online, nonparametric change detection method that detects multiple changes in multivariate data. Unlike existing methods, BnB approaches change detection by separating points before and after the change using an ensemble of random partitions. BnB is (a) scalable: it scales linearly in the number of time ticks and dimensions and, being online, uses bounded memory and bounded time per iteration; (b) effective: it provides theoretical guarantees on the false positive rate and achieves an F-measure 70% or more higher than baselines in experiments averaged over 11 datasets; (c) general: it is nonparametric and works on mixed data, including numerical, categorical, and ordinal data.
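
The idea of separating points before and after a candidate change with an ensemble of random partitions can be illustrated with a minimal sketch. This is not the BnB algorithm itself, whose details the abstract does not give; it is a simplified stand-in that assumes purely numerical data, axis-aligned random cuts, and a fixed window on each side of the candidate time. The names `separation_score` and `make_series`, the window size of 50, and the ensemble size of 200 are hypothetical choices made for illustration.

```python
import random


def separation_score(before, after, n_partitions=200, seed=0):
    """Average, over random axis-aligned cuts, of how differently the cut
    splits the 'before' window and the 'after' window (0 = same, 1 = fully
    separated). Illustrative only; not the scoring rule used by BnB."""
    rng = random.Random(seed)
    dims = len(before[0])
    combined = before + after
    total = 0.0
    for _ in range(n_partitions):
        d = rng.randrange(dims)                # pick a random dimension
        lo = min(p[d] for p in combined)
        hi = max(p[d] for p in combined)
        cut = rng.uniform(lo, hi)              # pick a random threshold in range
        frac_before = sum(p[d] < cut for p in before) / len(before)
        frac_after = sum(p[d] < cut for p in after) / len(after)
        total += abs(frac_before - frac_after)
    return total / n_partitions


def make_series(seed=1):
    """Synthetic 2-D series (assumed data): the mean of the first
    dimension jumps from 0 to 5 at time tick 100."""
    rng = random.Random(seed)
    calm = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
    shifted = [[rng.gauss(5, 1), rng.gauss(0, 1)] for _ in range(100)]
    return calm + shifted


series = make_series()
# Windows that straddle the true change at tick 100.
score_at_change = separation_score(series[50:100], series[100:150])
# Windows that both lie before the change.
score_no_change = separation_score(series[0:50], series[50:100])
print(score_at_change, score_no_change)
```

In this sketch, random cuts split the two windows very differently when they straddle the change and similarly when they do not, so the first score comes out larger than the second. Because each cut only compares counts on either side of a threshold, no distributional assumption is needed, which is the nonparametric flavor the abstract describes.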
