Detection of Anomalies in Water Networks by Functional Data Analysis

We introduce a methodology based on functional data analysis (FDA) for detecting anomalous flows in urban water networks. Telecontrol systems record the primary hydraulic variables in real time, so the measurements can be treated as functional data (FD). In the first stage, the data are validated (false data are detected) and reconstructed, because the records may contain missing and noisy values as well as false ones. This stage uses FDA tools such as tolerance bands for FD and smoothing for dense and sparse FD. In the second stage, functional outlier detection tools are applied in two phases. In Phase I, anomalies are removed from the data so that what remains is representative of the in-control system. In Phase II, the system is monitored against that in-control reference. We also propose a new functional outlier detection method based on archetypal analysis. The methodology is applied to and illustrated with real data, and a simulation study assesses the performance of the outlier detection techniques, including our proposal. The results are very promising.
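To make the two-phase idea concrete, the following is a minimal sketch under stated assumptions, not the method proposed in the paper. It does not use archetypal analysis or tolerance bands. Instead it scores each daily flow curve by a simple robust distance to the pointwise median, which is a stand-in chosen only for illustration. The function names (`outlyingness`, `phase_one`, `phase_two`), the cutoff of 3.0, and the synthetic flow curves are all hypothetical.

```python
import numpy as np


def outlyingness(curves, reference):
    """Root-mean-square robust z-score of each curve against a reference set.

    Each curve is compared pointwise with the reference median and scaled
    by the reference MAD (illustrative score, not the paper's method).
    """
    med = np.median(reference, axis=0)
    mad = 1.4826 * np.median(np.abs(reference - med), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # guard against zero spread
    z = (curves - med) / mad
    return np.sqrt(np.mean(z ** 2, axis=1))


def phase_one(curves, cutoff=3.0, max_iter=10):
    """Phase I: iteratively drop outlying curves to obtain an in-control set."""
    keep = np.ones(len(curves), dtype=bool)
    for _ in range(max_iter):
        scores = outlyingness(curves, curves[keep])
        new_keep = scores <= cutoff
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return keep


def phase_two(new_curves, reference, cutoff=3.0):
    """Phase II: flag new curves that deviate from the in-control reference."""
    return outlyingness(new_curves, reference) > cutoff


# Synthetic daily flow curves sampled at 48 time points (hypothetical data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 48)
base = 10.0 + 5.0 * np.sin(2.0 * np.pi * t)
history = base + rng.normal(0.0, 0.5, size=(40, t.size))
history[5] += 6.0  # one anomalous day contaminating the historical data

keep = phase_one(history)
in_control = history[keep]

normal_day = base + rng.normal(0.0, 0.5, size=t.size)
leak_day = normal_day + 4.0 * (t > 0.5)  # sustained extra flow in the second half
flags = phase_two(np.vstack([normal_day, leak_day]), in_control)
```

Here `keep` marks the curves retained as in-control after Phase I, and `flags` indicates which of the two new days Phase II would report as anomalous. A depth-based or archetype-based score could be substituted for `outlyingness` without changing the two-phase structure.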
