Different Approaches to SCADA Data Completion in Water Networks

This work contributes to techniques for completing SCADA (Supervisory Control and Data Acquisition) data in databases of historical sensor signals from a water supply company. Our approach addresses the data restoration problem in two stages. In the first stage, each signal is treated as one-dimensional, and the missing samples are estimated by combining two linear predictor filters, one running forwards and one running backwards. In the second stage, the data are tensorized to exploit the underlying structure at five-minute, one-day, and one-week scales. A low-rank approximation of the tensor is then computed and used to refine the first-stage estimates. This refinement requires an offset compensation to guarantee the continuity of the signal at both ends of each burst of missing data. To assess the effectiveness of the proposed method, we performed statistical tests in which bursts of known sizes were deleted from a complete tensor and the performance of different strategies was compared. For the type of data used, the results show that the proposed data completion approach outperforms the other methods, with the difference becoming more evident as the size of the bursts of missing data grows.
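
The two-stage procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the predictor order, the linear cross-fade between the two predictors, the truncated HOSVD standing in for the low-rank approximation, the ranks, and the linear-ramp offset are all choices made for the example. The sketch also assumes a five-minute sampling period (288 samples per day), a single burst lying strictly inside a whole number of weeks, and no other missing samples.

```python
import numpy as np

def fit_ar(x, order):
    """Least-squares linear predictor: x[n] ~ sum_k a[k] * x[n-1-k]."""
    rows = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    coeffs, *_ = np.linalg.lstsq(rows, x[order:], rcond=None)
    return coeffs

def extrapolate(history, coeffs, steps):
    """Run the predictor `steps` samples past the end of `history`."""
    buf = list(history[-len(coeffs):])
    out = []
    for _ in range(steps):
        nxt = float(np.dot(coeffs, buf[::-1]))  # most recent sample first
        out.append(nxt)
        buf = buf[1:] + [nxt]
    return np.array(out)

def fill_burst(x, start, length, order=8):
    """Stage 1: blend forward and backward predictions across the burst."""
    left, right = x[:start], x[start + length:]
    fwd = extrapolate(left, fit_ar(left, order), length)
    # A backward predictor is a forward predictor on the time-reversed data.
    rev = right[::-1]
    bwd = extrapolate(rev, fit_ar(rev, order), length)[::-1]
    # Assumed weighting: trust each predictor most near its own side.
    w = np.linspace(1.0, 0.0, length + 2)[1:-1]
    filled = x.copy()
    filled[start:start + length] = w * fwd + (1.0 - w) * bwd
    return filled

def tensorize(x, samples_per_day=288):
    """Fold the series into a (week, day-of-week, time-of-day) tensor."""
    per_week = 7 * samples_per_day
    weeks = len(x) // per_week
    return x[:weeks * per_week].reshape(weeks, 7, samples_per_day)

def low_rank(t, ranks):
    """Truncated HOSVD: project each mode onto its leading singular vectors."""
    approx = t
    for mode, r in enumerate(ranks):
        unfolded = np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)
        u = np.linalg.svd(unfolded, full_matrices=False)[0][:, :r]
        projected = np.tensordot(u @ u.T, approx, axes=(1, mode))
        approx = np.moveaxis(projected, 0, mode)
    return approx

def offset_compensate(x, estimate, start, length):
    """Add a linear ramp so the estimate meets the known samples at both ends."""
    seg = estimate[start - 1:start + length + 1]
    d0 = x[start - 1] - seg[0]        # mismatch just before the burst
    d1 = x[start + length] - seg[-1]  # mismatch just after the burst
    ramp = np.linspace(d0, d1, length + 2)
    out = x.copy()
    out[start:start + length] = (seg + ramp)[1:-1]
    return out

def complete(x, start, length, ranks=(2, 3, 6)):
    """Stage 1 fill, then stage 2 low-rank correction with offset compensation."""
    stage1 = fill_burst(x, start, length)
    smooth = low_rank(tensorize(stage1), ranks).ravel()
    return offset_compensate(stage1[:smooth.size], smooth, start, length)
```

Calling `complete(signal, start, length)` on a series whose burst occupies `signal[start:start + length]` returns the series (trimmed to whole weeks) with only the burst replaced; the known samples are left untouched. The ranks control how much weekly, daily, and intraday structure the correction retains and would need tuning for real data.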
