Unsupervised Bayesian Nonparametric Approach with Incremental Similarity Tracking of Unlabeled Water Demand Time Series for Anomaly Detection

This paper proposes a fusion of unsupervised clustering and incremental similarity tracking of hourly water demand series. Current research on unsupervised methods for detecting anomalous water demand is limited, and existing methods may suffer from several drawbacks, such as requiring a large amount of data, requiring the selection of an optimal cluster number, or low detection accuracy. The proposed approach addresses the need for a large amount of data by detecting anomalies through (1) clustering points that are relatively similar at each time step, (2) clustering points at each time step by the similarity of their variation from one time step to the next, and (3) comparing incoming points with a reference shape for online detection of anomalous trends. In addition, the use of a Bayesian nonparametric approach, the Dirichlet Process Mixture Model, eliminates the need to choose an optimal cluster number and provides a subtle solution for ‘reserving’ an empty cluster for future anomalies. Of the 165 randomly generated anomalies, the proposed approach detected 159, along with other anomalous trends present in the data. As the data are unlabeled, the additional anomalous trends identified cannot be verified. However, the results show great potential for using a minimal amount of unlabeled water demand data for preliminary anomaly detection.
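
The idea of ‘reserving’ an empty cluster can be illustrated with the Chinese restaurant process, a standard representation of the Dirichlet process prior. The following is a minimal sketch, not the implementation used in this work: the function name `crp_assignment_probs`, the concentration value `alpha = 1.0`, and the toy cluster labels are assumptions chosen for illustration. It computes only the prior probability of a new point joining each existing cluster or opening a new one; a full Dirichlet Process Mixture Model would also weight each option by the likelihood of the point under that cluster.

```python
from collections import Counter


def crp_assignment_probs(assignments, alpha=1.0):
    """Prior cluster probabilities for the next point under a Chinese
    restaurant process with concentration parameter alpha.

    assignments: cluster labels of the points seen so far (hypothetical data).
    Returns (existing, p_new): a dict mapping each existing label to its
    prior probability, and the prior probability of opening a new cluster.
    """
    n = len(assignments)
    counts = Counter(assignments)
    denom = n + alpha
    # An existing cluster attracts the next point in proportion to its size.
    existing = {label: count / denom for label, count in counts.items()}
    # A new, currently empty cluster always keeps probability alpha / (n + alpha).
    p_new = alpha / denom
    return existing, p_new


# Toy example: ten points already split into two clusters (sizes 7 and 3).
labels = [0, 0, 0, 1, 1, 0, 0, 1, 0, 0]
existing, p_new = crp_assignment_probs(labels, alpha=1.0)
print(existing, p_new)
```

Because the probability of a new cluster is never zero, a point that fits none of the existing clusters can still be placed in a cluster of its own, which is why no cluster number has to be fixed in advance.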
